Why OCaml? [video] (janestreet.com)
134 points by arto on Jan 26, 2016 | 76 comments


My takeaways from the talk:

1. Financial transactions are adversarial and have heavy tails. Testing is not sufficient for universal guarantees; the type system helps here.

2. A whole class of trivial bugs is absent because of Option (Some, None) and the System F type system.

3. The type system should be part of the design process (which happens in OCaml).

4. Precise and expressive type systems.

5. Right tool for the right job, but OCaml is in a sweet spot for writing a huge range of programs: little scripts, mini-languages for configs, big trading systems.

6. Teaching/learning: a good percentage of traders were able to learn it to a good level of competence in a month-long bootcamp; MIT and Harvard students were able to achieve competence in 3-4 week internships.

7. The "Python paradox": developing in an esoteric language helps attract higher-quality talent.

8. Tools and libraries: OCaml would not be great if you had to reinvent a lot of libraries, as in web development, but it has a reasonably good ecosystem for customized solutions (like trading systems).
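To make the Option point concrete, here is a minimal sketch in Python rather than OCaml. The `PRICES` table and the `find_price` and `describe` names are made up for illustration. In OCaml the compiler insists that both the Some and None cases are handled; in Python the `Optional` annotation is only enforced if you run an external type checker, so treat this as an approximation of the idea.

```python
from typing import Optional

# Hypothetical price table, for illustration only.
PRICES = {"AAPL": 101.5, "MSFT": 52.25}

def find_price(symbol: str) -> Optional[float]:
    """Return the price, or None when the symbol is unknown (like Some/None)."""
    return PRICES.get(symbol)

def describe(symbol: str) -> str:
    price = find_price(symbol)
    if price is None:                  # the "None" case, handled explicitly
        return f"{symbol}: no price"
    return f"{symbol}: {price:.2f}"    # the "Some price" case

print(describe("AAPL"))  # prints AAPL: 101.50
print(describe("XYZ"))   # prints XYZ: no price
```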


I'm not sure I understand the adversarial part or why that should matter. He mentioned it being a zero-sum game, with some having to lose for others to win. I don't get why that's relevant to the programming language choice.

I think what he should have said is that financial transactions are important, and that it's expensive to screw up. This is of course as true, or more so, for, say, NASA's flight software (which is not a problem that's adversarial in the way he means) as it is for financial trading software.


The difference is that if NASA screws up, there's a certain percentage chance that the bug will be exposed, and cause a problem. That percentage usually stays constant over the lifetime of the system, regardless of how many times the bug has already bitten them. In an adversarial situation, there's a percentage chance that the bug will be exposed, and after that a 100% chance that adversaries will exploit it.


Pick up "Capital Markets for Quantitative Professionals" - a great resource - and get a taste of this as early as the preface! The author writes about a bug he introduced in a program making markets in treasuries. It cost his firm $30k every trade, and the errant trades piled up as other participants recognized his mistake.


I am learning Haskell but it is very s-l-o-o-o-w to get to a point where I feel I am actually able to write useful software (probably my fault in fairness!). I have looked over the fence at OCaml and it seems it hits a sweet spot between practicality and rigor.

Can anyone advise on whether learning OCaml is worth it when compared to Haskell, considering I am on my way with Haskell, if only slowly? The lack of tooling on Windows (I write cross platform desktop software) and the maturity of the web frameworks (compared to Haskell) have put me off learning OCaml in the past...


Learn F#. It's basically OCaml for .NET (there are some differences between them, but they're very similar). You get great tooling, a huge library of functions (F# supports all .NET libraries), a welcoming community, and the language itself is very well thought out.

If you decide you'd like to learn OCaml at a later date, you'll find it easy if you've already learnt F#.

With regards to web frameworks, you've got some good options, ranging from heavyweight frameworks like WebSharper, to lightweight frameworks like Suave.

Much of F# is cross platform too. It's been open source for a long time, and runs fine on Mono.

If you'd like some extra information about F#, I'd recommend the website 'F# for fun and profit':

http://fsharpforfunandprofit.com/


I'm surprised by how many people recommend F# over OCaml. Seems to me that OCaml has more features, better tools and support (except on Windows), and a more creative community, though. Am I missing something obvious?


It's not as one-sided as you make out. OCaml definitely has some advantages (such as native code compilation and support for some extra language features such as functors), but F# has its own advantages too, such as:

* A larger choice of available libraries.

* Multicore support (OCaml is getting this soon, but it's currently single core).

* A more straightforward standard library. OCaml has three competing standard libraries (the default one, Batteries, and Core), and as far as I can see the F# one is better than all of them in terms of consistency. For example, you can expect the same group of Iter, Map, Filter, etc. on all grouped data structures (Arrays, Lists, Sequences, Maps, etc.).

* Visual Studio is a damn good IDE, and makes coding in F# even better. Perhaps something similar exists for OCaml, but I'm not aware of anything of the calibre of VS. Worth noting in both cases you don't need an IDE to be effective.

I'm not dismissing OCaml, I'd like to learn it one day too, but I hope I've helped explain why some people would want to learn F# first.


F# has the distinct advantage of native interoperation with anything in .NET-land.


Learning Haskell is rewarding in the long term, since it deepens your understanding of things.

I learnt both Haskell and OCaml at the same time since their core is essentially the same. But in real world scenarios the way you think about problems is different in both languages. Haskell tends to be more abstract and favor mathematical reasoning, while OCaml is pragmatic but still very expressive.

TL;DR: It depends on your needs. If you want to learn the language and start building something quickly, OCaml is definitely a better choice. You can always come back later to Haskell, and you'll be surprised to see how similar they are.


You might want to start with: http://blog.ezyang.com/2010/10/ocaml-for-haskellers/ and explore the languages for what they are yourself.

I haven't tried Haskell apart from the very basic stuff, but if you have tried your hand at it for too long, and aren't able to make any headway, then you might as well invest time in learning OCaml/F# or Erlang or Idris or Rust or Elm instead. I hear Rust and Elm are pretty good too.

OCaml doesn't do multi-core, I think [0], whilst Haskell has great support for parallelism and concurrency [1]. Erlang, of course, is absolutely amazing at that too.

I primarily learn languages to expand my understanding of various programming constructs. It feels liberating to know that things are much easier in certain languages due to presence of certain constructs absent in other languages. For instance, OCaml is insanely good for compiler design and/or static analysis primarily due to 'variants' and the type system; Go (channels)/Erlang (actor-based programming, hot-plug code)/JVM-based languages (Clojure/Scala et al running on vastly optimized vm) for building massively concurrent systems; Ruby for being expressive, powerful, and simply awesome; JavaScript for... well... getting a taste of event-driven programming.

The challenges faced when using some of these languages are fascinating and at some point, it all comes together, as you try searching for a language in another language (like trying to mimic functional/reactive paradigms in Java 7 or below when faced with problems in handling "streaming data" for example).

In my personal experience, I have found strongly-typed languages to be easier to master and adopt in production than dynamically-typed languages.

----

[0] https://news.ycombinator.com/item?id=9582980

[1] https://ghcmutterings.wordpress.com/2009/10/06/parallelism-c...


> OCaml doesn't do multi-core, I think

Yes and no. Currently, OCaml has only cooperative or GIL-restricted multi-threading support, but multi-process/distributed systems work just fine (largely due to its built-in near universal serialization mechanism, which handles even closures).

It's not perfect, but it's still functional.


Can you expand on this point? When I tried OCaml a little time ago, it was a pain even to have a polymorphic print function. I would be surprised if there was a serialization mechanism that produces a binary form that cannot be adapted to produce a textual form


Have a look at the Marshal module [1]. Marshal can be polymorphic because it has compiler support and can rely on runtime internals (similar to how comparison is polymorphic or the Printf module can work). It works for pretty much everything where the representation is known to OCaml (the exceptions are things like extern C types wrapped in OCaml types).

Example:

  let f x = 2 * x + 1
  let s = Marshal.to_string f [ Marshal.Closures ]
  let g: (int -> int) = Marshal.from_string s 0
  let () = Printf.printf "%d\n" (g 2)
Note: this may not work in the REPL and you may have to put it in a file and compile it.
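For comparison, a rough Python analogue of the same round trip, offered only as a sketch: Python's standard `marshal` module can serialize a function's compiled code object, and `types.FunctionType` can rebuild a callable from it. This is not equivalent to `Marshal.Closures`; in particular, it does not carry captured variables, and the byte format is specific to the Python version.

```python
import marshal
import types

def f(x):
    return 2 * x + 1

# Serialize the function's compiled code object to bytes.
s = marshal.dumps(f.__code__)

# Rebuild a callable from the bytes; the receiver supplies the globals.
g = types.FunctionType(marshal.loads(s), globals(), "g")

print(g(2))  # prints 5
```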

Note also that serialization need not be polymorphic. See e.g. the s-expression mechanism [2], which relies on metaprogramming facilities.

[1] http://caml.inria.fr/pub/docs/manual-ocaml/libref/Marshal.ht...

[2] https://realworldocaml.org/v1/en/html/data-serialization-wit...


Thank you. s-exps are a pain, because you have to manually specify the function sexp_of_t for any type t that you want to support


No, that's what the syntax extension is for (the metaprogramming I mentioned). You write type foo = ... with sexp (camlp4 syntax; with the newer ppx rewriter the annotation is [@@deriving sexp]) and those functions will be generated automatically for you. Mind you, the whole Jane Street Core machinery is still a bit heavyweight for my taste, but that particular concern shouldn't be an issue. For example, in utop:

  utop # #require "core";;
  utop # #require "core.syntax";;
  utop # open Core.Std;;

  utop # type foo = int * string with sexp;;
  type foo = int * string
  val foo_of_sexp : Sexp.t -> foo = <fun>
  val sexp_of_foo : foo -> Sexp.t = <fun>                                         

  utop # sexp_of_foo (1, "foo");;
  - : Sexp.t = (1 foo)
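The idea of deriving a serializer from the shape of a type can be sketched in Python too, with the caveat that this version inspects values at runtime instead of generating code at compile time. The `sexp_of` helper and the `Order` record are hypothetical, and atoms are not quoted or escaped the way a real s-expression library would do it.

```python
from dataclasses import dataclass, fields, is_dataclass

def sexp_of(value) -> str:
    """Render a value as an s-expression string by inspecting its structure."""
    if is_dataclass(value) and not isinstance(value, type):
        # Records become a list of (field value) pairs.
        parts = [f"({fld.name} {sexp_of(getattr(value, fld.name))})"
                 for fld in fields(value)]
        return "(" + " ".join(parts) + ")"
    if isinstance(value, (tuple, list)):
        return "(" + " ".join(sexp_of(v) for v in value) + ")"
    return str(value)  # atoms: no quoting or escaping in this sketch

@dataclass
class Order:  # hypothetical record type
    qty: int
    symbol: str

print(sexp_of((1, "foo")))        # prints (1 foo)
print(sexp_of(Order(3, "AAPL")))  # prints ((qty 3) (symbol AAPL))
```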


> OCaml/F# or Erlang or Idris or Rust or Elm

I think PureScript deserves to be in this list as well :)


OCaml has most of the "good and easy parts" of Haskell, without the ones that prevent you from using your old habits:

* no lazy evaluation => your usual sense of what's executed when isn't challenged

* imperative features included, albeit often syntactically ugly => You're not stuck because you need to compose 3 monad transformers together, you can't make sense out of the resulting types, and you have absolutely no idea of how the result might be executed anymore. Empirically, the imperative syntax is just ugly enough to make you slightly ashamed of going imperative, and spend 1/4 hour figuring out whether there's a clean and reasonably easy alternative. I don't know whether it's been done on purpose but it's well done :-)

What I miss the most from Haskell in OCaml are type classes. I bet they would have been included, if OCaml had been invented after Haskell.

From Haskell you keep a type system which helps you think and catches many classes of errors, encouragement to think functionally, and a good terseness/readability compromise. If you have the time, I'd advise you to learn Haskell, in order to stretch your mind and become an excellent OCaml developer, the way learning Latin makes you a better French or Italian writer.

Re: lack of tooling: I'd be surprised if F# wasn't properly supported, and if you're targeting Windows that's probably your best bet. More fundamentally, your truly valuable tooling isn't the IDE, it's the type system. I need a fancy IDE for Python or Java, not for OCaml.


Typeclasses play poorly with Modules. One is the picture of a global phenomenon and the other the essence of forsaking global phenomena.

There's current work under the name "modular implicits" to solve this problem, though.


But there are a few languages that have both, such as Ur and Coq, and possibly others. Also, I know that there's work on a more powerful module system for Haskell called Backpack (although, I'm not really familiar with it). I agree that there are certain conceptual incompatibilities between the two as you mention, but they're both so useful that it seems like there must be good ways to bring them together, or at least make both of them available.


For those interested in modular implicits, see http://www.lpw25.net/ml2014.pdf


If you are looking for web frameworks, F# (a cousin of OCaml) is a better bet for you compared to OCaml and fits well with your current development environment.

F# has Suave.io, which is more idiomatic, along with ASP.NET MVC, which you can fit F# into.

Freya for OWIN - RESTful development looks very nice too.

Canopy - for Web UI automation (DSL on selenium) is also an awesome tool.


I attended the talk Yaron gave so I have a little context - OCaml is definitely, definitely worth it, and I feel like I have a decent grasp of the language after only 3 weeks of working with it, to the point that I can make useful, large programs.

I've learnt Haskell in the past and it was slow primarily because Learn You a Haskell is I think poorly written.

Read Yaron's book Real World OCaml. It's great.


I was also shocked how quickly I was able to write code in OCaml after having no prior experience with static typing. A couple weeks in and I was quite comfortable.


Definitely. I learned OCaml from the online tutorial and that gave me a pretty good starting point in a few weeks. Compared to the Haskell tutorial on Haskell's website, it focuses a lot more on building real programs versus the sort of REPL-based approach given on the Haskell site (which basically doesn't teach anything more than short one-liners due to its limitations). And even the most popular Haskell book (LYAH) is a worse resource IMO.


I personally found OCaml to be perfect for learning typed (Hindley-Milner) functional programming.

Haskell with its strong syntactic support was good to learn about Monads and how to work with enforced purity.

From there, I went back to OCaml, using both Monadic IO and OCaml's better module/functor system.


I did the exact same thing.


I also have been going through a multi-year Haskell learning curve. One thing that helped make me more productive was writing the pure code first (use mock data) and after the pure code works well, then worry about writing the main program, dealing with impure code for file IO, web access, etc.
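The same workflow carries over to other languages. As a minimal Python sketch (the function names and the one-number-per-line input format are assumptions for illustration): keep the logic in pure functions, check them against mock data, and confine file IO to a thin outer function.

```python
def parse_amounts(lines):
    """Pure: turn text lines into floats, skipping blank lines."""
    return [float(line) for line in lines if line.strip()]

def total(amounts):
    """Pure: add up the amounts."""
    return sum(amounts)

def main(path):
    """Impure shell: file IO and printing live only here."""
    with open(path) as fh:
        print(total(parse_amounts(fh)))

# Exercise the pure core with mock data first; no file is needed.
mock_lines = ["1.5\n", "\n", "2.5\n"]
print(total(parse_amounts(mock_lines)))  # prints 4.0
```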

I have heard great things about OCaml but it seems to make more sense to master one strongly typed language first, and since you already are partially up to speed on Haskell, why not stick with it for now?


This is very similar to how I've been writing my Idris code. I start from the sort of core operations of a program and then slowly spread out from there, before finally incorporating IO at the end.


You are right that it can take a long time to learn Haskell. I've been learning it for probably six years now. At this point I can browse through most libraries on Hackage, read the types, and usually figure out how to use something. I'm still picking up bits and pieces of category theory to help use the more academic-minded libraries, but invariably I find that once I grasp a concept it seems like such a simple and elegant idea. It's just that the implementation and use take getting used to.

Nowadays I think it's easier to do real-world stuff thanks to the amazing work by Michael Snoyman et al. on Stack, the curated package set and build tool, and other projects like Chris Allen and Julie Moronuki's haskellbook.com.


AFAIK F# is supposed to be an ML that is more like OCaml, but on the .NET runtime. Give it a look, it may be useful for you. Also, IIRC, the compiler for it went OSS recently. But I've not tried the language myself, so this is hearsay.


idk about OCaml but if you find learning Haskell slow I highly recommend learning Erlang. It's similar to Haskell in some way but 10x simpler and highly practical. Probably the most practical language/framework (OTP) I've ever come across outside of maybe Ruby/Rails.

By 'practical' I mean in terms of starting out new -> launching a production-worthy business application in an efficient time window.

My problem with Haskell is that I spend all of my time trying to build perfect software instead of productively pumping out software. This is why I prefer Erlang, despite the fact Haskell is a superior language in many ways.


I love Erlang and OCaml both, but I am hesitant to describe it as "highly practical" without major caveats. If you're writing clusters of non-HTTP network servers, sure; nothing beats Erlang/OTP. But outside that niche it's difficult to justify.


Great talk - for me the module system, and functors in particular, are one of the great strengths of OCaml.

I'm hoping increased usage by Jane Street, Facebook and Bloomberg, along with the Unikernels/Docker tie-up, will lead to increased uptake and visibility. I personally find it more suitable for the systems space than Go, and with far more features to help build correct code.

(shameless plug - we're using OCaml for our microservices platform in London and are now hiring - https://www.stackhut.com/#/careers)


For people curious about OCaml and wanting to actually learn the language, aka build something useful, then come to the next OCaml meetup at the Climate Corporation in San Francisco. It's a workshop from idea to opam publishable package. We'll build a command line tool to do bread and butter coding, aka HTTP requests and JSON manipulation. You'll have reusable code and a starting point so you don't waste time with building code and actually spend time writing code.

http://www.meetup.com/sv-ocaml/events/228367572/

As incentive I will be giving away Enter the Monad t-shirts courtesy of Jane Street, thank you yminsky!


One thing I love about OCaml is named parameters. I am an explicit kind of guy and, ironically, even with all the type inference OCaml has, it is a damn explicit language (although I sort of prefer ad-hoc polymorphism, but oh well). It seems like a pretty trivial thing to have named parameters, but there are so many languages that do not have this feature (or have it and it breaks something, e.g. Scala -> Java).
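For readers who haven't seen the feature, here is a small Python sketch of the same idea using keyword-only arguments (OCaml spells these as labelled arguments such as `~source`). The `transfer` function is hypothetical.

```python
def transfer(*, source: str, dest: str, amount: int) -> str:
    """The bare * makes every parameter keyword-only, so call sites must name them."""
    return f"{amount} from {source} to {dest}"

# Argument order no longer matters, and the call site documents itself.
print(transfer(amount=10, dest="b", source="a"))  # prints 10 from a to b

try:
    transfer("a", "b", 10)  # a positional call is rejected
except TypeError as exc:
    print("rejected:", exc)
```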


"Raise your hands if you know who John Carmack is." (long pause) "Raise your hands if you've ever played Quake, or Rage, or Doom..." (pause) "...or if you've ever heard of them." (pause, blank stares) "Oh! Well, I guess it's been awhile..."


Related: Jeff Meyerson's audio interview with Yaron Minsky.

http://softwareengineeringdaily.com/?s=janestreet

It goes into a lot more of the nuts and bolts of JaneStreet and puts the use of OCaml in a more technical context.


Also from Yaron Minsky, in the same vein - http://queue.acm.org/detail.cfm?id=2038036


Thanks, great article. I am learning Haskell, but this article on OCaml is generally useful.


But still no Windows support for OPAM, right? sigh

That's one of the things that Rust absolutely got right, helping its adoption tremendously.


It's being very actively worked on. No solid ETA yet but I would expect the coming version or the one after to work out of the box.


You could turn that sigh into a PR. You want it, then make it happen, this is open source after all.

Here is something that might help in the mean time, https://www.typerex.org/ocpwin.html.


That's a stupid reply. I don't question the right of the OCaml/OPAM developers to ignore platforms they don't like and use. I question the wisdom in doing so.

Asking people who aren't invested in the slightest in your language so far to start hacking on your tooling before they can try and learn your language is an interesting onboarding strategy.


No one has a vested interest in your interests other than you.


Counterexample: the Rust team.

I'll leave it at that. You're simply not reasonable.


That is a fairly bad counter example, considering that Rust is primarily backed by Mozilla with the view to use in Firefox which pretty much requires Windows support to be worthwhile.


IMHO the biggest factors when evaluating a new programming language boil down to "modules", i.e. the methods for creating reusable units of code, and "package management", i.e. how easy it is to require previously built units of code and combine them to create compound units. I wish he had talked more about those aspects.

For me, any language that misses these two is not worth the effort unless you have really good reasons (i.e. every clock cycle counts in an embedded program).

Anyway, to that end, what is OCaml's packaging ecosystem like?


In my experience OCaml has the best modularity abstractions out there. OCaml modules are unique since they allow you to work with blocks of code (and types) on a higher level.

Composing and extending modules, passing modules as parameters to other modules and this way adapting the code to your need.

And the nice thing is that all the modules are evaluated at compile-time. Which means you don't have to sacrifice performance for modularity and still get a very flexible framework for Metaprogramming[1].

[1]: https://ocaml.org/learn/tutorials/modules.html#Functors
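Python has no functors, but the idea of passing a module as a parameter to another module can be approximated at runtime with a factory function, loosely in the spirit of OCaml's `Set.Make`. This is only a sketch: `make_set` and both orderings are invented for illustration, and nothing here is resolved at compile time or checked against a signature.

```python
def make_set(compare):
    """Functor-like factory: given an ordering, build a sorted-set class for it."""
    class SortedSet:
        def __init__(self):
            self.items = []

        def add(self, x):
            # Insert x in order, skipping duplicates according to `compare`.
            for i, y in enumerate(self.items):
                c = compare(x, y)
                if c == 0:
                    return
                if c < 0:
                    self.items.insert(i, x)
                    return
            self.items.append(x)

    return SortedSet

# Two "instantiations" with different orderings (both hypothetical).
IntSet = make_set(lambda a, b: a - b)
CaselessSet = make_set(
    lambda a, b: (a.lower() > b.lower()) - (a.lower() < b.lower())
)

s = IntSet()
for n in (3, 1, 2, 3):
    s.add(n)
print(s.items)  # prints [1, 2, 3]
```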


It sounds like you're already aware of this from your question and others have already beat me to the punch, but:

OCaml modules are best of breed. Better than anything else. Nothing else even comes close.

They're weird and different from modules you'll see in other languages. Then after you get used to them you'll find those other modules are at best neutered and most likely severely broken versions of what you've come to know and love from OCaml (or any of the true MLs).

OCaml modules arise from a certain treatment of "existential types" which are, in a lot of ways, the very foundation of type abstraction. If you don't care about types practically, then you can care about them spiritually as a means of describing program interfaces. Existential types are maybe the very foundation of abstract program interfaces.

So, OCaml is at least 50% good. Opam is kind of nice, too.


These days, it's pretty good. opam is the tool of choice:

http://opam.ocaml.org

I've found it to be surprisingly reliable and easy to use.


I'm not very familiar with the actual library distribution stuff in OCaml, but linguistically, the ML style of modules and interfaces is really nice, and sometimes even Haskell programmers envy it. Especially the built-in distinction between an interface and an implementation, which lets you depend on an abstract API rather than on a concrete library.


Reading about side effects in py I found: https://github.com/python-effect/effect


Does making the py recursive miss the point because py isn't guaranteed to be immutable?

  def sum(list):
      if list:
          return list.pop() + sum(list)
      return 0


Python lacks the tail recursion optimizations common in functional languages like Lisp. Seems like the language designers favor iteration over recursion, because recursion is limited to a depth of ~1000 in the default implementation (see sys.getrecursionlimit).

It is always possible to convert recursive into iterative, and most of the time doing so is trivial. I recommend doing this unless your algorithm is O(log n) or better.
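A minimal sketch of that conversion, assuming the default recursion limit of roughly 1000 (the function names are chosen here so they don't shadow the builtin `sum`):

```python
def recursive_sum(lst):
    """Recursive version: one stack frame per element."""
    if not lst:
        return 0
    return lst[0] + recursive_sum(lst[1:])

def iterative_sum(lst):
    """Iterative version: constant stack depth, input left untouched."""
    total = 0
    for x in lst:
        total += x
    return total

big = list(range(5000))  # deeper than the default limit

try:
    recursive_sum(big)
except RecursionError:
    print("recursive version hit the recursion limit")

print(iterative_sum(big))  # prints 12497500
```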


Even if Python did tail-recursion optimization, this implementation isn't tail-recursive so it would fail in the same way. This version is:

    def sum(li, s=0):
      if len(li) == 0:
        return s
      else:
        return sum(li[1:], li[0] + s)


The functional bits of Python were a very early addition by a colleague of GvR; he never wanted lambdas, map and such to be in Python.


For anyone curious, @ 15 min https://www.youtube.com/watch?v=r75X4Vn_E9k discusses a recursive sum() and why getrecursionlimit() is there.


Excellent resource! I could not put it better than Gavin Bong in this talk "Functional Programming in Python for the Uninitiated". Thanks!


Is that true even with "generator expression" (Python's lazy evaluation)?


Generators take the guts of a loop and make them first class. In order to evaluate that generator incrementally, you still need to provide a loop or a recursive function.
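A small illustration of that point, with a hypothetical `running_total` generator: creating the generator runs nothing, and each value only appears when some caller steps it, either by hand with `next` or with a loop.

```python
def running_total(numbers):
    """Generator: the body of a summing loop, suspended at each yield."""
    total = 0
    for n in numbers:
        total += n
        yield total

gen = running_total([1, 2, 3])  # nothing has executed yet
print(next(gen))                # prints 1; a single step, driven by the caller

# Something still has to loop to drain the rest.
last = None
for value in gen:
    last = value
print(last)                     # prints 6
```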


The OCaml version presented in the video isn't tail-recursive, either.


aaah. The recursion limit. So sum() fails on large inputs.


I'm not sure if that's something that he wrote or you wrote, but a `sum` function which has a side-effect of emptying its input list seems like a terrible thing. I hope you wouldn't use that in production.


Why's that bad? If there's nothing else the input is needed for, why keep it around?


When someone calls that, they wouldn't expect the input to be consumed.

  >>> list = [1,2,3]
  >>> sum(list)/len(list)
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
  ZeroDivisionError: integer division or modulo by zero


Thanks. I keep forgetting py isn't working on a copy of the var. copy.deepcopy() kinda fixes it, but then memory use is doubled, and the recursion limit kills it all anyway. I better go read my py code...
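For a flat list of numbers a shallow copy is enough; `copy.deepcopy` only matters when the elements themselves are mutable containers. A sketch, with the destructive function written iteratively and given a name that admits what it does (the name is an assumption, not anything from the talk):

```python
def sum_and_empty(lst):
    """Destructive sum: pops every element, leaving lst empty."""
    total = 0
    while lst:
        total += lst.pop()
    return total

data = [1, 2, 3]
result = sum_and_empty(list(data))  # pass a shallow copy; data is untouched
print(result / len(data))           # prints 2.0

scratch = [4, 5]
sum_and_empty(scratch)              # without a copy, the caller's list is emptied
print(scratch)                      # prints []
```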


It's not bad necessarily to do it that way inline in another function, because there's no way to use it elsewhere without realizing that it has the side effect of emptying the list. However since this is recursive that's not realistically how it would be used.

It wouldn't be bad if it were obvious what it did, which it isn't with the name "sum", but would be with the name "sum_and_empty". For good measure, maybe "sum_in_linear_space_and_empty" is even better.

It's also less efficient (although in the context of a recursive sum definition that's not too relevant). Essentially what you're doing here is using the length of the list as the iteration variable — the `i' in `for (i = n-1; i >= 0; i--)'. But `pop()' probably does more than decrement the length field, I imagine it also does some memory management bookkeeping at least.

I also take issue with how `list' is mutated and accessed in the same expression. I don't really use python, but I think I recall reading that it in fact guarantees left-to-right evaluation of such expressions. Even still, it's confusing and makes programmers, especially those who write C/C++ where this is undefined behavior, uneasy.


The only one who knows if the input is needed anymore is the caller, not all callers will agree, so it's bad form to let the called function decide automatically unless it's very clearly marked as a mutating, list-destroying function.


Your function is fine but if you didn't want it to affect the original list you could do something like:

    def sum(lst):
       if lst:
           return lst[0] + sum(lst[1:])
       return 0


Without knowing about the recursion limit that's what I should have written if I understood .pop() was mutating the input var's global state. It's a nice accident though.


because some idiot like me might join the team and call "sum" (or some parent that calls sum) thinking that everything is fine.


You can write functional code in any language, just as you've shown. But your mistake perfectly elucidates the point. When you call a function in a functional language, you know the variable you pass as the parameter won't be mutated. When you call a function in a non-functional language, you only hope it won't.

And even when the type system allows you to know the input variable won't be mutated (say, passing by copy in C++), you don't know what other kinds of side-effects the function may be having.


Fortran, usually not regarded as a functional language, has a `pure` keyword to declare that a function (or procedure) has no side effects, and `intent(in)` to declare that a function argument will not be mutated.


Is multicore OCaml still being considered for merging into the main OCaml distribution?




