Zeroing Memory is Hard: VC++ 2015 arrays (randomascii.wordpress.com)
124 points by deverton on July 18, 2016 | hide | past | favorite | 55 comments


Apparently gcc and clang are not fooled by { 0 }, so that's good:

https://twitter.com/areuugee/status/754888773498867712


I'd wager they have an optimization pass that coalesces adjacent constant stores.


Among others. This is likely caught as zero-initialisation already in the front end, before GIMPLE/LLVM IR is even generated.


At this point we must ask which Update of VS 2015, as MS just replaced the optimizer with an SSA-based one. https://blogs.msdn.microsoft.com/vcblog/2016/05/04/new-code-...


I'm the main developer of the new optimizer. It's a bit too much to say it replaces the old one, it's more of an addition. I was aware of this issue with initializing local arrays, it was on the TODO list - hopefully for the next VS.


I look forward to a fix, especially if the (I think) larger issue of large aggregates not being initialized efficiently is addressed. It feels like it is another instance of the same issue.

I hope the Python script is helpful for finding some of the odd variants that currently exist.


Why isn't the ssa optimizer mentioned in the Update 3 release notes? Is it enabled by default in Update 3?


Are you allowed to respond directly to the author in this case?


There's not much preventing any engineer at Microsoft from talking to developers. No lawyers need to be involved :-)


A good point, but already answered in the article:

> I recently noticed that it’s still an issue in VC++ 2015 Update 3


There's also the Clang/C2 toolchain, an option in the installer.


Clang/C2 doesn't use LLVM as its backend; it uses Microsoft's C2 backend.


That may be the case, but whatever the Clang frontend sends to C2 does produce better code than the Microsoft frontend. It probably already folds the stores at the frontend level, instead of leaving it to the backend to figure out.


Also note that you can't zero-initialize an array of objects that don't have a default constructor, and that if you just add an empty default constructor to shut the compiler up, the memory occupied by your array will not be initialized to zero or anything else.

Seems obvious and elementary enough, but it got me good the other day.


If you need to manually define a default constructor, and you want the behavior of default-initializing all members, declare the constructor as:

    T() = default;


If you want to feel some real pain ... take the following code (simplified for posting in this thread):

    template<typename T> struct Function {
      Function() = default;
      void* callback = nullptr;
    };

    struct CPU {
      Function<void ()> functions[65536];  //= {0} makes no difference
    };

    int main() {
      CPU cpu;
      return 0;
    }
There's a bug in GCC, at least in versions 5.1 - 5.4, where the above code will take 30s-90s to compile (depending on your CPU speed) and eat 500MB+ of RAM doing it. The bug isn't present in 4.9, and possibly not in newer GCC releases either.

What causes it? "Function() = default;" of course. Change it to "Function() {}" and the code compiles in 100ms.

So because I can't tell people, "don't use GCC 5.x", or "don't ever use any of my class objects in a large array", I basically can't use "= default;" syntax on my constructors in any of my library code now :(


Explicitly defaulting the constructor behaves the same as an empty constructor: all members are default-initialized. The problem is usually that 'default-initializing' doesn't zero out primitive (POD) types, but leaves them uninitialized.


With MSVC, the /sdl flag can zero-initialize class members[1] (among other things). This runs before the constructor. Aside from the warnings that get turned into errors (you can disable a warning to bypass this behavior), it also does limited pointer sanitization and turns on strict_gs_check.

[1] https://msdn.microsoft.com/en-us/library/jj161081.aspx


> The problem is usually that 'default-initializing' doesn't zero out primitive (POD) types, but leaves them uninitialized.

Exactly, that's what hosed me. Some double-precision floats ended up with garbage values and I couldn't see how it was happening.


Explicitly using the T(), T{}, T = {} syntax will value-initialize the object, which will zero-initialize trivially constructible types.


Ah, interesting. Apparently this is true even if some constructors are explicitly '= default' (for the T x{} and T x={} list/aggregate initialization cases only).

But I believe that rule doesn't apply for any objects where adding an '= default' constructor would be useful, because it requires that there be no user-defined constructors or private/protected members. Otherwise that syntax ultimately calls the default constructor, which won't do any zeroing.

See http://en.cppreference.com/w/cpp/language/aggregate_initiali...


Yes, forgot to mention this crucial detail.


If you ask the compiler to optimise for size you should really be getting a REP STOS for anything that needs more than the ~3 instructions which that takes.

(It's not the fastest way for small sizes --- it actually is for larger sizes, just like REP MOVS --- but the user asked for smaller, not faster code.)


Bruce, thanks for the kick! Six years is a long time to wait for a bug resolution.

It turns out that our compiler has a minimum size limit in a memset optimization. I’m sure the size limit was there for a Very Good Reason (TM) at one point in time, but we are investigating whether we can remove it. Step one is understanding why it was there in the first place.


= { 0 } is an "initialize almost anything" idiom in C; it's worth recognizing specially in a compiler.


Far worse is trying to grow a std::vector with std::vector::resize() and subsequently trying to opt out of array initialization.


Most STL containers with the default allocator will always initialise memory (some, like std::array, don't iirc). That's a feature. If you don't want it, then you'll have to use your own allocator or container. Both bring changes throughout the codebase, unfortunately.


> Both bring changes throughout the codebase, unfortunately.

That's where auto often helps. Lets you change types as long as you don't break the interface.


Then you might need std::vector::reserve() instead.


But reserve() doesn't work if you are going to fread() into the buffer or otherwise fully initialize it. std::vector<char> basically requires you to double-initialize the buffer if you pass it to a function that just needs a raw pointer to fill, which can be an issue in some high-performance contexts.

I hit this in a game server where a few percent of CPU time (that was enough to count as low-hanging-fruit) was being spent on memset inside of resize().
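The pattern being described looks roughly like this (a sketch; read_into is a hypothetical helper, not from the thread):

```cpp
#include <cstdio>
#include <vector>

// Fill a vector from a FILE*. resize() value-initializes (memsets) every
// new byte, and fread() then overwrites all of them -- so the buffer is
// effectively initialized twice.
std::size_t read_into(std::FILE* f, std::vector<char>& buf, std::size_t n) {
    buf.resize(n);                            // pays for a memset of n bytes
    return std::fread(buf.data(), 1, n, f);   // immediately overwrites them
}
```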


Yes, I am in precisely this situation. It made me realize how political the C++ standard library authorship is.


What was the solution? Using a custom allocator?


I am not OP, but you could get around it by having a default constructor that is empty. This might've not been possible if you need a different default constructor that initializes values.


In my case the fix was to not allocate so much space. The vector was always being resized to 64 KiB. After considering lots of other options I decided that the most pragmatic fix was just a smaller constant to the vector constructor. I don't remember why the other options weren't practical.


Yes you could use a custom allocator except that changes the type of the std::vector - making a large ripple in our code base.

I have not solved it yet. The best solution - don't use std::vector for anything critical.


A simple new int[n] will work. It doesn't initialize the array (unlike the new int[n]() version).

You can use it with std::unique_ptr<> to simplify memory management.


This is true, but note that you should use the array form, std::unique_ptr<int[]>, which calls "delete [] data"; the base template's default behavior of "delete data" would be undefined for an array. (Alternatively, supply a custom deleter.)


Using a custom allocator is the standard-blessed solution for this problem, although it can be inconvenient.
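A minimal sketch of such an allocator (a common workaround; the names here are illustrative): override construct() so that elements are default-initialized rather than value-initialized, which lets resize() skip the memset for trivial types.

```cpp
#include <memory>
#include <new>
#include <utility>
#include <vector>

// Like std::allocator<T>, but construct() with no arguments performs
// default-initialization (no zeroing) instead of value-initialization.
template <class T, class Base = std::allocator<T>>
struct default_init_allocator : Base {
    template <class U> struct rebind {
        using other = default_init_allocator<
            U, typename std::allocator_traits<Base>::template rebind_alloc<U>>;
    };
    using Base::Base;

    template <class U>
    void construct(U* p) {
        ::new (static_cast<void*>(p)) U;   // default-init: no memset
    }
    template <class U, class... Args>
    void construct(U* p, Args&&... args) {
        std::allocator_traits<Base>::construct(
            static_cast<Base&>(*this), p, std::forward<Args>(args)...);
    }
};

// A vector whose resize() does not zero newly added chars.
using raw_vector = std::vector<char, default_init_allocator<char>>;
```

This is where the inconvenience comes in: the allocator is part of the vector's type, so every signature that passes the vector around has to change with it.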


[flagged]


Please comment civilly and substantively or not at all.


Sorry, I was very bored and couldn't hold myself :)

Also, didn't know this kind of comment was not OK.


[flagged]


I recently stopped regularly visiting Slashdot, and started regularly visiting HN.

Besides regularly finding more appealing content, I have found that typically HN comments are less hateful, disparaging and isolating.

One (minor) version of this distinction is the constant debate on Slashdot about what should be published, and what should not. There, an assumed standard is expected for something to be considered "news for nerds" or whatever. For the most part I haven't seen that here. I've appreciated that fact.

Because fuck me, if you can't just pass by a link among thirty or so, give up and stop using the web. It's full of shit you don't think belongs there. Just stop shitting on things I want to read just because you don't want to read it too.

[Edited because upon re-reading, I really wished it had line breaks.]


I recently returned to HN, having read it a little a few years back. I remembered it as a low-volume, high-quality feed, where I was hesitant to speak, because all the other comments were so full of knowledge.

Now, HN seems to have become like any other place, full of noise like politics, social commentary, and just regular news. I guess the commenters are still knowledgeable about technological subjects, though.


The inevitability of every good site is that it will eventually grow, and with it, the diversity of the interests of the user-base. While I come to HN mainly for the technical news, I find it convenient that it also keeps me in the loop about a wider variety of topics I would surely miss otherwise. Your preference may obviously vary.


I browse here with a greasemonkey script to hide links from mainstream news sites, aggregators, etc. That cuts down to mostly just the tech content.

    var links = document.links;
    var boring = [
      'adage.com',
      'arstechnica',
      'bbc',
      'discovermagazine',
      'bloomberg',
      'bloombergview',
      'buzzfeed',
      'businessinsider',
      'californiasunday',
      'cnn',
      'csmonitor',
      'dailymail',
      'digg',
      'economist',
      'engadget',
      'esquire',
      'fastcompany',
      'forbes',
      'ft.com',
      'fortune',
      'fusion',
      'geekwire',
      'gizmodo',
      'harvard.edu',
      'huffingtonpost',
      'inc.com',
      'longreads',
      'medium',
      'mondaynote',
      'nature',
      'nautil.us',
      'newscientist',
      'newstatesman',
      'newyorker',
      'npr.org',
      'nybooks',
      'nymag',
      'nytimes',
      'pando',
      'psychologytoday',
      'qz.com',
      'reddit',
      'reuters',
      'sciencedaily',
      'scientificamerican',
      'slate',
      'techcrunch',
      'telegraph',
      'theatlantic',
      'theguardian',
      'thenation',
      'theparisreview',
      'theverge',
      'theregister',
      'time',
      'usatoday',
      'vancouversun',
      'vice',
      'vimeo',
      'vogue',
      'washingtonpost',
      'wired',
      'wsj',
      'yahoo',
      'youtube',
      'zdnet'
    ];
    var matcher = new RegExp('\\b(' + (boring.join('|').replace(/\./g, '\\.')) + ')\\b');
    for (var i = 0; i < links.length; i++) {
      var hostname = links[i].hostname.replace(/^www\./, '');
      if (hostname.match(matcher)) {
        links[i].style.setProperty('display', 'none');
        links[i].parentNode.insertBefore(document.createTextNode('[removed]'), links[i]);
      }
    }
It's not that I wouldn't read those sites, they're just not what I'm interested in reading when I come here.


Cool! I've been wondering if there is a feed of only the good stuff. If HN itself doesn't provide one, maybe there should be a separate site that does that filtering. I was thinking it would be manually curated, but maybe a list like yours would work as well.

Maybe you should set up such a site. All it would have to do is link to the articles and to the corresponding HN discussions.


Most but not all politics and social commentary gets flagged pretty quickly. Some tech news also gets flagged. It would actually be interesting to have a page for flagged stories for transparency.

Getting responses from people actually involved in making tech decisions (like Gratilup here: https://news.ycombinator.com/item?id=12113571, and recently one of the original authors of Excel CSV export) is part of the appeal of this site to me.


You can enable showdead in the settings; flagged articles will show up as [dead] in /newest.


Thanks. Some show up as flagged, some show up as dead, but lots of live stories in between. I'm more interested in the ones where they had lots of votes in a short time but didn't make it past the flag process.


Then maybe http://hnrankings.info/ is what you're looking for? Front page articles which get "flagged off the front page" don't end up [flagged] or [dead], instead they drop in rank quickly. You can often see that quite well from the charts on that site.


That would probably be closer and I suspect they have the data to get what I want, which is more like a list of flagged stories to understand some of the inherent bias in the site.


Out of interest, where would you go for news that was like the old HN?

I used to really enjoy ArsTechnica but the quality of the journalism seems to have dropped slightly.


Perhaps https://lobste.rs/ might be an addition to your reading habits?


Is michaelochurch still getting upvoted there? I used to enjoy lobste.rs, but seeing yet another heavily upvoted rant about "open-plan agile mouthbreathing drones" or similar would ruin my whole morning.



Maybe I should have phrased it, "what am I missing?" Which was the actual purpose of the question.

Maybe I'm not missing much. Cool.

Best all.



