Saturday, March 30, 2019

What Capt. Kirk's Internet is saying about Big Data

 A data scientist is not a cog in the machine. And there is more to the profession than pushing buttons. Science is part art, and asking the right questions is not a talent that comes easily.

My friend George Lawton has been thinking about road traffic and AI and human cognition, even human empathy. Having watched or heard about at least a half dozen instances of road rage this week, I think he is on to something.  What would TensorFlow do? WWTFD?

The cocktail approach has gained maturity in various fields. It's coming to data science.

Thursday, March 28, 2019

Julia language

Haven’t been to an MIT open lecture for a while. Recently took in one that concerned Julia, an open source programming language with interesting characteristics.

The session was led by MIT math prof Alan Edelman. He said the key to the language was its support for composable abstractions.

An MIT News report has it that: “Julia allows researchers to write high-level code in an intuitive syntax and produce code with the speed of production programming languages,” according to a statement from the selection committee. “Julia has been widely adopted by the scientific computing community for application areas that include astronomy, economics, deep learning, energy optimization, and medicine. In particular, the Federal Aviation Administration has chosen Julia as the language for the next-generation airborne collision avoidance system.”

The language is built to work easily with other programming languages, so you can sew things together. I take it that Julia owes debts to Jupyter, Python and R, and like them finds use in science. Prof Edelman contrasted Julia's speed with that of Python.
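For instance, here is a hedged sketch of my own (not from the talk) of the sort of sewing-together he meant, assuming the PyCall package has been installed and a working Python is available: Julia can call into C with no wrapper, and into Python through PyCall.

# Calling C directly: ccall takes the (function, library) pair, the return type,
# the argument types, and then the arguments. Library naming can vary by platform.
c_result = ccall((:cos, "libm"), Float64, (Float64,), 1.0)

# Calling Python through the PyCall package (assumed installed via Pkg.add("PyCall")).
using PyCall
pymath = pyimport("math")        # load Python's standard math module
py_result = pymath.cos(1.0)      # call it as if it were a Julia function

println(c_result ≈ py_result)    # both routes should give the same answer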

In deep neural networks, he said, working through gradients these days comes down to linear algebra rather than a scalar neural-net problem. Julia can do this quickly (it's good at backpropagation), he indicated. He also saw it as useful in addressing the niggling problem of reproducibility in scientific experiments that rely on computing.
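As a rough illustration of that gradient work (my sketch, not Edelman's example), the Zygote package gives Julia reverse-mode differentiation over ordinary functions built on linear algebra; the toy loss below is invented for the purpose.

using Zygote, LinearAlgebra

# A toy squared-error loss over a weight vector w, an input x, and a target y.
loss(w, x, y) = (dot(w, x) - y)^2

w = [0.5, -1.0, 2.0]    # illustrative weights
x = [1.0, 2.0, 3.0]     # illustrative input
y = 4.0                 # illustrative target

# Zygote.gradient returns the gradient of the loss with respect to each argument,
# which is the "backprop" step a deep-learning framework would otherwise run for you.
gw, gx, gy = gradient(loss, w, x, y)
println(gw)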

Here are some bullet points on the language from Wikipedia:

* Multiple dispatch: providing ability to define function behavior across many combinations of argument types (see the sketch after this list)
* Dynamic type system: types for documentation, optimization, and dispatch
* Good performance, approaching that of statically-typed languages like C
* A built-in package manager
* Lisp-like macros and other metaprogramming facilities
* Call Python functions: use the PyCall package
* Call C functions directly: no wrappers or special APIs
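To put a little flesh on the first bullet, here is a small sketch of my own (not from the lecture) showing how the method that runs is picked by the types of all the arguments, not just the first:

# Multiple dispatch: Julia picks a method based on every argument's type.
abstract type Shape end
struct Circle <: Shape
    r::Float64
end
struct Square <: Shape
    s::Float64
end

# Different method bodies for different combinations of argument types.
overlaps(a::Circle, b::Circle) = "circle-circle test"
overlaps(a::Circle, b::Square) = "circle-square test"
overlaps(a::Square, b::Circle) = overlaps(b, a)   # reuse the mixed case
overlaps(a::Square, b::Square) = "square-square test"

println(overlaps(Circle(1.0), Square(2.0)))   # dispatches on both argument types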

Also from Wikipedia: Julia has attracted some high-profile clients, from investment manager BlackRock, which uses it for time-series analytics, to the British insurer Aviva, which uses it for risk calculations. In 2015, the Federal Reserve Bank of New York used Julia to make models of the US economy, noting that the language made model estimation "about 10 times faster" than its previous MATLAB implementation. 

Edelman more or less touts superior performance numbers for Julia versus NumPy. Google has worked with the language on TPUs and machine learning [see "Automatic Full Compilation of Julia Programs and ML Models to Cloud TPUs"].
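For a sense of the kind of comparison that comes up (my own quick sketch, not Edelman's benchmark), Julia's built-in arrays and the @time macro cover ground that Python code would usually hand off to NumPy:

using LinearAlgebra

A = rand(1000, 1000)    # random matrices, roughly numpy.random.rand(1000, 1000)
B = rand(1000, 1000)

@time A * B             # first run includes just-in-time compilation
@time A * B             # second run reflects the steady-state speed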

Its magic, he says, is multiple dispatch; Python does single dispatch on the first argument. That's one of the biggies. (Someone in the audience sees a predecessor in Forth. There is nothing new in computer science, Edelman conjectures; the early people just didn't see its application to use cases like the ones we see here, he infers.) Also important is type stability. What are composable abstractions? I don’t know. J. Vaughan

Related
http://calendar.mit.edu/event/julia_programming_-_humans_compose_when_software_does#.XJ1julVKiM9
http://news.mit.edu/2018/julia-language-co-creators-win-james-wilkinson-prize-numerical-software-1226
https://en.wikipedia.org/wiki/Julia_(programming_language)
https://www.nature.com/articles/s41562-016-0021