[Bug] gcc bug affecting OCC fundamentals

It sounds as brutal as unbelievable but I can only attribute this to a gcc bug. If anyone has other explanation/recommendation that would be interesting and valuable.
Here is the simplest test case:

#include

void TestMat()
{
gp_Mat M0;
cerr }

Default constructor of gp_Mat initializes the internal 3x3 matrix with 0 (see gp_Mat.lxx), so is expected output. This is really the case on all configurations:
- Visual Studio 2005, 2008 on Windows
- gcc 4.0.1 -m32 with both -O2 and -O0 on MacOS 10.5 64bit
- gcc 4.1.2 -m32 with -O0 on Linux 64bit

However, gcc 4.1.2 with -O2 -m32 produces weird results leaving 2 of 3 matrix rows (i.e. elements (2,1)-(3,3)) uninitialized. I came to this observation during digging test cases in CAD Exchanger failing on Linux in release mode (-O2).

Looking at gp_Mat.lxx we see the following:

#define Mat00 ((Standard_Real*)M)[0]
#define Mat01 ((Standard_Real*)M)[1]
[…]
#define Mat22 ((Standard_Real*)M)[8]

inline gp_Mat::gp_Mat () {
const Standard_Address M = (Standard_Address)&(matrix[0][0]);
Mat00 =
Mat01 =
Mat02 =
Mat10 =
Mat11 =
Mat12 =
Mat20 =
Mat21 =
Mat22 = 0.0;
}

However during compilation of the above test here is what gcc tells:
../src/CadExTestLib/CadExTestLib_OCC.cxx:84: warning: M.gp_Mat::matrix[1][0] is used uninitialized in this function
../src/CadExTestLib/CadExTestLib_OCC.cxx:84: warning: M.gp_Mat::matrix[1][1] is used uninitialized in this function
../src/CadExTestLib/CadExTestLib_OCC.cxx:84: warning: M.gp_Mat::matrix[1][2] is used uninitialized in this function
../src/CadExTestLib/CadExTestLib_OCC.cxx:84: warning: M.gp_Mat::matrix[2][0] is used uninitialized in this function
../src/CadExTestLib/CadExTestLib_OCC.cxx:84: warning: M.gp_Mat::matrix[2][1] is used uninitialized in this function
../src/CadExTestLib/CadExTestLib_OCC.cxx:84: warning: M.gp_Mat::matrix[2][2] is used uninitialized in this function

Thus, gcc has inlined the constructor body into the test case and now complains that some memory left uninitialized ! Weird, isn’t it ?
It looks like it could not resolve indirect references to memory via &matrix[0][0] and then access M[3]…M[8] and just ignored them. This looks more than dangerous (as compiler’s optimization must not result in wrong memory contents), so I am not fully certain if this is really the case, so any hints would be appreciated. However due to this flaw several valid test cases failed on Linux, built with -O2 mode. So I would recommend that OCC team ensures that they have similar test cases in the regression test suite.

So, I ended up fixing this by using direct references to memory.
#define Mat00 matrix[0][0]
...
#define Mat22 matrix[2][2]

The fix is attached.

Hope this helps.
Roman

Attachments: 
Mark's picture

Hi Roman

Are you sure you're using -m32 on a 64-bit OS? When I tried that (with g++ 4.4) I got a missing header error. Same for using -m64 on a 32-bit install with same gcc version.

I don't have gcc 4.1 installed anywhere, but gcc (and g++) v 4.4.3 and 4.4.5 all produce an executable which outputs
"M0= {{0, 0, 0}, {0, 0, 0}, {0, 0, 0}}". Use of -m32 or -m64 in combination with -O0 or -O2 did not affect compilation or the program's output, except for the aforementioned header error when -mNN was used on the wrong arch.

HTH
Mark

p.s. Open Cascade, it's December now. Feel free to give us our christmas present (6.4.0) a bit early! :)

Roman Lygin's picture

Hi Mark,

Thanks for your feedback. Yes, I do use -m32 on the 64 bit OS. Here is the sample excerpt from the build log:
g++ -c -DNDEBUG -DOCC_CONVERT_SIGNALS -DCSFDB -DHAVE_CONFIG_H -DNo_Exception -fPIC -DUSE_PTHREAD -O2 -Wall -Wno-non-virtual-dtor -m32 -I../inc -I../drv -I../../LicenseManager/inc -I../../QOLib/inc -I../../TestLib/inc -I/users/rlygin/dev/Dev/OCC/631/fix-lin32-gcc4-release/inc -I/users/rlygin/dev/DevTools/tbb/3.0.2/include -I/users/rlygin/dev/DevTools/Qt/4.7.1-lin32/include \
\
-o ../lin32/gcc4/obj/CadExTestLib/Release/CadExTestLib_CadEx.o ../src/CadExTestLib/CadExTestLib_CadEx.cxx

-bash-3.2$ uname -a
Linux 2.6.21-1.3194.fc7 #1 SMP Wed May 23 22:47:07 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux

Everything compiles just fine and 95%+ test cases (out of 900+) run just fine.

The most suspicious reason is indeed some bug in gcc which might have been fixed in later versions. Your indication of 4.4.x reflect that.
Roman

Denis Barbier's picture

Hello Roman,

We applied your patch in OCE and slightly extended it. Here is our patch description, which is also contained within the patch file.

Subject: [PATCH] Work around GCC (< 4.3) bugs in gp_Mat and gp_XYZ

Report and patch provided by Roman Lygin at
http://www.opencascade.org/org/forum/thread_19603/

For instance, Standard_ConstructionError is raised
in gp_Trsf::SetValues, the reason is that gp_Mat::Subtract
does not work as expected (only the first row is
subtracted).

gp_Trsf T;
T.SetValues(
1.0, 0.0, 0.0, 0.0,
0.0, 1.0, 0.0, 0.0,
0.0, 0.0, 1.0, 0.0,
1.e-6, 1.e-6);

This patch has been extended to gp_Mat.cxx and gp_XYZ.cxx
to prevent an infinite loop in this code:
BRepPrimAPI_MakeBox b(gp_Pnt(-100,-60,-80),150,200,170);
b.Build();
TopoDS_Shape b_shape = b.Shape();
BRepOffsetAPI_MakeOffsetShape offsetA(b_shape,60,0.01);

It has also been extended to gp_Mat2d and gp_XY.

It is not clear what makes GCC buggy, but compilers should
be smart enough to compute multiple array indices when
compiling, so this change should not introduce any performance
problems. It would be better to go one step further and
remove macros, but it will make this patch much longer.