Data Data Data: March 2016

Tuesday, March 22, 2016

You and the night and the machine intelligence

Sunday, March 20, 2016

On the eve of the White House Water Summit

From On the Waterfront

References to The Manhattan Project (for example, "We need a new Manhattan Project" to address [fill in the blank]) are overdone. But we need something on that order to deal with water. California knows what it is like to live with this lifeblood threatened, and Israel does too. It is good to cast attention on it, and that might happen to some extent this week as The White House Water Summit takes place.

One of the issues that must be addressed is data about water. It is not as good as data on oil, or stocks, but it should be. In a New York Times op-ed column, Charles Fishman writes about water and data, and how weak the efforts are to categorize, detail and report on water use.

Imagine if NOAA only reported on weather every fifth day. That is analogous to the water reports of the U.S. government, according to Fishman, who says, where water is concerned, we spend five years rolling up a report on a single year. The biggest problem, says Fishman, is water's invisibility, here and globally.

He focuses on the fact that the water census is done only every five years, which gives us only a 20% view of the total water experience. He points to Flint, Toledo and the Colorado basin as recent water crises and notes that adequately monitoring the water doesn't assure results, but that inadequately monitoring the water is criminal, what with so much monitoring of Wall Street, Twitter tweets or auto traffic. Any call for more monitoring of course is up against today's version of the 1800s Know-Nothing movement.

Fishman tells us that good information does three things: 1- it creates demand for more information; 2- it changes people's behavior; and, 3- it ignites innovation.

But what is next? My little look-see into this area uncovered an overabundance of formats for representing water data. It seems a first step for water data improvements might come with the application of modern big data technology to the problem of multiple formats.
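To make the multiple-formats problem concrete, here is a minimal sketch of folding two feeds into one common record. Everything in it is an assumption for illustration: the two feed layouts, the field names, the site names and the sample values are made up, and real water data formats are surely messier.

```python
import csv
import io
import json
from dataclasses import dataclass

LITERS_PER_GALLON = 3.785411784  # US liquid gallon


@dataclass
class WaterReading:
    """Common record that every feed is normalized into (hypothetical)."""
    site: str
    date: str      # ISO date string, e.g. "2016-03-20"
    liters: float


def from_csv(text):
    """Hypothetical CSV feed with columns site,date,gallons."""
    rows = csv.DictReader(io.StringIO(text))
    return [
        WaterReading(r["site"], r["date"], float(r["gallons"]) * LITERS_PER_GALLON)
        for r in rows
    ]


def from_json(text):
    """Hypothetical JSON feed: a list of {"station", "day", "liters"} objects."""
    return [
        WaterReading(d["station"], d["day"], float(d["liters"]))
        for d in json.loads(text)
    ]


# Made-up sample feeds, one per format.
csv_feed = "site,date,gallons\nSite A,2016-03-20,100\n"
json_feed = '[{"station": "Site B", "day": "2016-03-20", "liters": 250.0}]'

readings = from_csv(csv_feed) + from_json(json_feed)
for r in readings:
    print(r.site, r.date, round(r.liters, 1))
```

The point of the sketch is only that each format gets its own small reader, and everything downstream sees one shape and one unit.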

Remember the Beach Boys: "In an ocean or/in a glass/ cool water is such a gas."

Related
http://www.nytimes.com/2016/03/17/opinion/the-water-data-drought.html
http://www.nytimes.com/2016/03/22/opinion/our-water-systemwhat-a-waste.html

Sunday, March 13, 2016

Four tales of data

Scenario - Four young students on Spring Break go to Honduras in search of the roots of big data. Each comes back with a story. Together they tell of a struggle entwined. Look at data anew through the eyes of four groovy undergrads. Not yet rated.

Oceans of data - Hundreds of meters below the surface of the ocean, Laura Robinson probes the steep slopes of massive undersea mountains. She's on the hunt for thousand-year-old corals that she can test in a nuclear reactor to discover how the ocean changes over time. Big data is her co-pilot.

https://www.ted.com/talks/laura_robinson_the_secrets_i_find_on_the_mysterious_ocean_floor

Lord, bless your data - Thomas Bayes, writer F.T. Flam says [I am not making this up], set out to calculate the probability of God's existence. This was back in the 18th century in jolly old England. The math was difficult and really beyond the ken of calculation of the time - until the recent profusion of clustered computer power came around the corner in the early 2000s.

https://en.wikipedia.org/wiki/Thomas_Bayes
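
For the curious, the rule that carries Bayes's name is small enough to sketch. This is a minimal illustration of a single Bayesian update; the function name and the numbers are made up for the example, and nothing here is taken from Flam's article.

```python
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Return P(hypothesis | evidence) using Bayes' rule.

    prior               -- P(hypothesis) before seeing the evidence
    p_evidence_if_true  -- P(evidence | hypothesis is true)
    p_evidence_if_false -- P(evidence | hypothesis is false)
    """
    # Total probability of seeing the evidence at all.
    evidence = p_evidence_if_true * prior + p_evidence_if_false * (1 - prior)
    return p_evidence_if_true * prior / evidence


# Made-up numbers: start at 50/50, then see evidence that is twice as
# likely if the hypothesis is true as if it is false.
posterior = bayes_update(prior=0.5, p_evidence_if_true=0.8, p_evidence_if_false=0.4)
print(round(posterior, 3))
```

One update is easy. The heavy computing comes in when there are many hypotheses and many pieces of evidence to update over at once.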

Autopilot and the roots of cybernation - Elmer Sperry's son, Lawrence Burst Sperry, nicknamed Gyro, was best known for inventing the autopilot utilizing the concepts developed by his father for the gyroscope. Alas, he was lost over the English Channel when only 31.

https://www.pinterest.com/pin/568649890431580951/

I was just trying to help people - The story by Veronique Greenwood tells us she wrote of her experience in a letter to the New England Journal of Medicine, and was subsequently warned by her bosses not to do that kind of query again. Presumably HIPAA privacy concerns are involved, so get out the anonymizer and gun it, right?!

http://itsthedatatalking.blogspot.com/2014/10/calling-dr-data-dr-null-dr-data-for.html

-----

Data does baseball

Brought to you by MBA@Syracuse: Tools of Baseball Analytics

Data, Humans on the Side

Good things - The recent PBS show The Human Side of Data has much to recommend it. As someone who labors in the big data vineyard as a reporter and commentator, I appreciate its succinct high-level view on one of the defining memes of now. I had the chance to speak with my colleague Ed Burns on the topic for the Talking Data Podcast, and thought I'd add to the discussion here.

There were some beautiful animated pictures of data crossing the world, be it air traffic on a normal day or one tweet spreading. A theme was that humans had to create narratives around the data (per Jack Dorsey), and to follow the trail from the data point to the actual real-world event (Jer Thorp). What makes a culture collectively change its view of data? - one participant asks. What is the context of the data? - several query.

Cause for pause things - And that takes us to an underlying issue with the show, which is that there is this unspoken progressive notion that we are getting better, that an Ivy League youngster who studied statistics and grew up with the Web for pablum, soda, and bread can do better than the dimwits that went before. It could be true. But correlation is not causation. To phrase a coin. -Jack Vaughan

Wednesday, March 9, 2016

On the machine learning curve

There was an article in the New York Times today I thought I might mention. "Taking Baby Steps Toward Software That Reasons Like Humans" (below) by John Markoff is an articulate look at what I call machine learning.

The story considers what is going on these days as a re-vitalization of artificial intelligence, which bloomed in the 1980s and then faded from headlines, and I agree. I think the story conveys, in a way, that there are some similarities.

The story doesn't use the term 'machine learning', though it does mention pattern recognition (somewhat synonymous), deep learning and deep neural nets, which are fairly similar. What I think that emphasizes is that today's 'machine learning' is basically a new take on neural networks.

And as such machine learning faces hurdles, because the flaws that stymied AI still remain to be addressed. As Markoff writes, "generalized systems that approach human levels of understanding and reasoning have not been developed."

What he doesn't say is that the people who sell these things today tend to gloss over that, same as their counterparts 'back in the day.' That is not to criticize this particular work, which necessarily has a limited objective.

The story doesn’t use the term 'cognitive computing' either. But it talks about things – Q&A systems, natural speech processing - that combine with 'deep learning' to create cognitive computing.

Taking Baby Steps Toward Software That Reasons Like Humans

By JOHN MARKOFF MARCH 6, 2016

Richard Socher appeared nervous as he waited for his artificial intelligence program to answer a simple question: “Is the tennis player wearing a cap?” The word “processing” lingered on his laptop’s display for what felt like an eternity. Then the program offered the answer a human might have given instantly: “Yes.” ....