
> Oh of course they tried to build a high-level CPU too. Still a terrible idea decades later:

I'll bite. A high-level CPU doesn't necessarily have to add stuff; sometimes a high-level CPU does less because the tight integration with the OS and language does more.

If memory safety is enforced by your compiler, e.g. by your OS only executing WASM, you can get rid of the MMU. [1]
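To make "memory safety enforced in software" concrete, here is a minimal sketch of a WASM-style linear memory where every access is bounds-checked before it touches the backing store. The names `LinearMemory` and `Trap` are hypothetical, and a real engine would emit these checks into the compiled code (or elide them where it can prove they are unnecessary) rather than call a Python method.

```python
class Trap(Exception):
    """Raised on an out-of-bounds access, standing in for a WASM trap."""


class LinearMemory:
    """Hypothetical sandboxed memory: one flat byte array per module."""

    def __init__(self, size):
        self._bytes = bytearray(size)

    def _check(self, addr, width):
        # The software replacement for a hardware page fault.
        if addr < 0 or width <= 0 or addr + width > len(self._bytes):
            raise Trap(f"out-of-bounds access at {addr} (+{width})")

    def load(self, addr, width):
        self._check(addr, width)
        # WASM linear memory is little-endian.
        return int.from_bytes(self._bytes[addr:addr + width], "little")

    def store(self, addr, width, value):
        self._check(addr, width)
        mask = (1 << (8 * width)) - 1  # wrap the value to the access width
        self._bytes[addr:addr + width] = (value & mask).to_bytes(width, "little")


mem = LinearMemory(16)
mem.store(12, 4, 0xDEADBEEF)   # last valid 4-byte slot
value = mem.load(12, 4)
try:
    mem.load(13, 4)            # straddles the end of memory
    trapped = False
except Trap:
    trapped = True
```

Since the module can only name offsets inside its own array, it cannot reach another module's memory, which is the property the MMU would otherwise provide.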

If you get rid of branch prediction and prefetching logic you get more space for more cores. If you hook up memory in parallel to those cores you get higher effective memory throughput and executed instructions per second.

If you get rid of the MMU you can also use that space for accelerating database indexes. CPUs already contain trie walkers for their page tables, you might as well use that for looking up data in indexes directly.

In the end highly parallel architectures won. It's just that we don't run logic programs in parallel but fuzzy neural stuff via array programming language DSLs.

1: https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&d...



Replacing hardware memory safety with WASM is a great idea, but FWIW you can get most of the benefits by just using hugepages. E.g., clang spends 7% of its time on TLB misses and you can get a free 5% speedup using hugectl: https://easyperf.net/blog/2022/09/01/Utilizing-Huge-Pages-Fo...
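A quick back-of-the-envelope check of those two numbers, assuming the 7% is a fraction of total runtime and that nothing else changes when the TLB misses go away (an Amdahl-style simplification, not a measurement):

```python
def speedup_if_removed(fraction):
    """Speedup factor when `fraction` of the runtime is eliminated entirely."""
    return 1.0 / (1.0 - fraction)


tlb_fraction = 0.07        # share of runtime spent on TLB misses (from the comment)
observed_speedup = 1.05    # the "free 5%" reported with hugectl

# Best case: every TLB miss disappears.
upper_bound = speedup_if_removed(tlb_fraction)

# Share of the original runtime that the 5% speedup actually removed.
removed = 1.0 - 1.0 / observed_speedup

# How much of the TLB-miss time hugepages recovered under these assumptions.
recovered_share = removed / tlb_fraction

print(f"upper bound: {upper_bound:.3f}x")
print(f"runtime removed: {removed:.1%}")
print(f"share of TLB-miss time recovered: {recovered_share:.0%}")
```

Under these assumptions the ceiling from TLB misses alone is about 1.075x, and the 5% speedup already recovers roughly two thirds of it.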

The practical solution would be to replace the MMU with a tiny one that's backwards compatible and extremely cheap in silicon area and latency. Old binaries would limp along due to translation overhead. New software would use only a couple of pages across the whole system, since memory protection is done in software when the bytecode is compiled and you never need to swap to disk. New binaries would run >>7% faster. I wouldn't call this a high-level CPU tho. Most high-level CPU ideas are really stupid ones, like tagging every word with a type ID that must be checked by hardware, wasting cache on something the compiler could have done.

> If you get rid of branch prediction and prefetching logic you get more space for more cores. If you hook up memory in parallel to those cores you get higher effective memory throughput and executed instructions per second.

Sounds like a GPU. Compiling normal code to run on chips like that is still an open problem and I doubt it will be solved within the next decade. We'd need AGI to rewrite code the same way humans do.

> CPUs already contain trie walkers for their page tables, you might as well use that for looking up data in indexes directly.

I had the same idea and I feel like it should be possible with today's CPUs, tho I'm not sure how to implement it. E.g., convert IP prefixes to page table entries and use the TLB to decide where to route packets, the same way routers use TCAM in ASICs.
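To show the shape of the idea in software only, here is a rough sketch of longest-prefix matching done as a page-table-style walk: a fixed number of levels, a fixed number of address bits consumed per level, and "leaf" entries at upper levels playing the role of huge pages for short prefixes. It does not touch the real MMU or TLB. The 8-bit stride, the `Node` layout and the `insert`/`lookup` helpers are all assumptions made for illustration.

```python
import ipaddress

STRIDE = 8   # address bits consumed per level (x86-64 page tables use 9)
LEVELS = 4   # 4 * 8 = 32 bits of IPv4 address
FANOUT = 1 << STRIDE


class Node:
    """One table in the walk, like one page of page-table entries."""

    def __init__(self):
        self.next_hop = [None] * FANOUT  # leaf payload, if any
        self.plen = [-1] * FANOUT        # prefix length that wrote the slot
        self.child = [None] * FANOUT     # pointer to the next-level table


def _index(addr, depth):
    # Slice out the STRIDE bits that select the slot at this depth.
    return (addr >> (32 - STRIDE * depth)) & (FANOUT - 1)


def insert(root, cidr, next_hop):
    net = ipaddress.ip_network(cidr)
    addr, plen = int(net.network_address), net.prefixlen
    level = max(1, -(-plen // STRIDE))   # ceil(plen / STRIDE), at least 1
    node = root
    for depth in range(1, level):
        idx = _index(addr, depth)
        if node.child[idx] is None:
            node.child[idx] = Node()
        node = node.child[idx]
    # Prefix expansion: a prefix shorter than the level boundary covers
    # several adjacent slots; longer prefixes keep priority within a slot.
    base = _index(addr, level)
    span = 1 << (STRIDE * level - plen)
    for idx in range(base, base + span):
        if plen >= node.plen[idx]:
            node.plen[idx] = plen
            node.next_hop[idx] = next_hop


def lookup(root, ip):
    addr = int(ipaddress.ip_address(ip))
    node, best = root, None
    for depth in range(1, LEVELS + 1):
        idx = _index(addr, depth)
        if node.plen[idx] >= 0:
            best = node.next_hop[idx]    # deeper levels hold longer prefixes
        node = node.child[idx]
        if node is None:
            break
    return best


table = Node()
insert(table, "10.0.0.0/8", "A")
insert(table, "10.1.0.0/16", "B")
insert(table, "10.1.2.0/23", "C")
insert(table, "0.0.0.0/0", "default")
```

With this table, `lookup(table, "10.1.3.9")` walks three levels and returns `"C"`, while `lookup(table, "8.8.8.8")` stops at the first level and falls back to `"default"`. Mapping this onto actual page tables would additionally need the hardware's fixed stride and entry format, which is the part the comment leaves open.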



