> True, but it is cleaner to reorganize the code into several functions and use ...

Remnant44 · on June 28, 2020

If you enable /Ox, the codegen basically drops to what you would expect: the vector version drops down essentially identical code to the new/delete (modulo a memset to enforce the clear to zero condition)

It is a good illustration of why debug stl builds are such hot garbage though...

gmueckl · on June 28, 2020

It is a good illustration for why using the STL is not always a good idea: you can't blindly take the perf hit from that kind of overhead in a debuggable build of a game that you still want to run at reasonably interactive frame rates.

jfkebwjsbx · on June 28, 2020

False, most standard libraries out there allow you to configure whether you want extra checks or not.

gmueckl · on June 29, 2020

But at least in the case of MSVC that comfiguration option leads to binary incompatibilities that make it an all or nothing option for everything that gets linked together statically. And the checks are so heavy that the "all" option becomes unbearably slow quite quickly. If your project hits a reasonable size, you end up requiring some clever solutions.

jfkebwjsbx · on June 28, 2020

> Having to jump to another function definition which is inline is a bigger mental block than following a goto.

Local lambdas are ideal for this.

> Are you suggesting that the code should have done something like this?

Yes, but you can manage the array inside too.

> I mean, sure, it's a very minor cleanup. It changes like, two lines though

The point is that TempArray can be reused everywhere. This is a typical class that many projects use (stack if small, heap is bigger than threshold).

> Here's modern MSVC. Let's play "spot the difference"

The optimizer has been asked to leave everything as it is, so that is the expected result.

BTW, MSVC is not what you should be using if you want performance.

> Don't just say "should be"... test it yourself!

I always test codegen for all abstractions I use! So I agree.

Const-me · on June 29, 2020

> Local lambdas are ideal for this.

The problem with lambdas, you can't mark their operator() with __forceinline or __attribute__((always_inline)) attributes. For this reason, when writing high-performance manually vectorized code, lambdas are borderline useless.

> MSVC is not what you should be using if you want performance.

Security and compatibility has higher priority. gcc and clang don't deliver their C runtime libraries with windows updates. Also, debugging and crash diagnostic is much easier with MSVC.

It's same on Linux BTW, only with gcc.

jfkebwjsbx · on June 29, 2020

> when writing high-performance manually vectorized code

The point was not about manually vectorized loops in particular. Why is that a problem if you are manually doing it, though?

> Security and compatibility has higher priority.

In commercial games, not really.

As for "compatibility", I am not sure what you mean.

> gcc and clang don't deliver their C runtime libraries with windows updates.

AFAIK you can use Windows libraries just fine. No need for using a different libc.

> Also, debugging and crash diagnostic is much easier with MSVC.

AFAIK, Clang can produce debugging info that you can use with VS.

I don't work on the environment, but it is what I have read here.

Const-me · on June 29, 2020

> Why is that a problem if you are manually doing it, though?

Here’s couple examples: https://github.com/Const-me/DtsDecoder/blob/master/Utils/App... https://github.com/Const-me/SimdIntroArticle/blob/master/Flo... I would like to use lambdas instead of classes, but can’t, due to that defect of C++.

> In commercial games, not really.

In commercial games too. While they don’t care about security, they do care about compatibility and crash report diagnostics.

> As for "compatibility", I am not sure what you mean.

A windows update shouldn’t break stuff. A software or a game should run on a Windows released at least 10 years in the future.

> Clang can produce debugging info that you can use with VS.

According to marketing announcements. This page however https://clang.llvm.org/docs/MSVCCompatibility.html only says “mostly complete” and also “Work to teach lld about CodeView and PDBs is ongoing”. PDB support is not about VC compatibility, it’s about Windows compatibility really: the debugger engine is a component of OS, even of the OS kernel. WinDbg is merely a GUI client for that engine, visual studio is another one.

Overall, in my experience, the platform-default compilers cause the least amount of issues. On Windows this means msvc, on Linux gcc, on OSX clang. Technically gcc and clang are very portable. Practically, when you’re using a non-default toolset, you’re a minority of users of that toolset, e.g. the bugs you’ll find will be de-prioritized.

flipchart · on June 28, 2020

I'm still getting up to speed on C++ (& ASM) so forgive the ignorance - are the differences you're referring to all the cleanup code that the vector does in the destructor?

As a side note, you're not calling delete[] on the array code, but I suppose it makes little difference if you take the vector cleanup into account.

westmeal · on June 28, 2020

unfortunately MSVC produces hot garbage asm, clang and gcc produce pretty much the same asm output using your example.