Thursday, June 30, 2016

The nature of field data gathering has changed

The nature of field data gathering has changed as mobile devices and notepad computers find wider circulation. Surveys that once went through arduous roll-up processes are now gathered and digitized quickly. Now a new stage of innovation is underway, as back-end systems enable users to employ field data for near-real-time decision making. An example in the geographic information system (GIS) space is ESRI's Survey123 for ArcGIS, which was formally introduced at ESRI's annual user conference, held this week in San Diego.


See also: Be there when the GIS plays Hadoop

Tuesday, May 10, 2016

Less Moore's Law in Store

[Image caption: Quantum computers wait in the wings as Moore's Law slows to a crawl. Source: IBM]
Fair to say our sister blog turned into "The Saturday Evening Review of John Markoff" a long time ago. Well, at Amazing Techno Futures, the news feeds are good, and we could do worse than to track John Markoff, who has been covering high tech at the New York Times for lo these many years. And I will not turn into a pumpkin if I hijack my own hijack of John.

For your consideration: his May 5 article on Moore's Law. He rightly points out that at its inception this was more an observation than a law, but Intel cofounder Gordon Moore's 1965 eureka, that the number of components that could be etched onto the surface of a silicon wafer was doubling at regular intervals, has stood the test of what today passes for time.

The news hook is the Semiconductor Industry Association's decision to discontinue its International Technology Roadmap for Semiconductors, based, I take it, on the closing of the Moore's Law era. The IEEE will take up where this leaves off, with a forecasting roadmap that tracks a wider swath of technology. Markoff suggests that Intel hasn't entirely accepted the end of this line.

Possible parts of that swath, according to Markoff, are quantum computing and graphene. Chip heat has been the major culprit blocking Moore's Law's further run. Cost may be the next bugaboo. So far, parallelism has been the answer.

Suffice it to say, for some people at least, Moore's Law has chugged on like a beautiful slow train of time. With the Law in effect, people at Apple, Sun, Oracle and the like could count on things being better tomorrow than they were today in terms of features and functionality. So the new future, being less predictable, is a bit more foreboding.

I had my aha moment on something like this in about 1983, when I was working on my master's thesis on local area networks. This may not be entirely a story about Moore's Law, but I think it has a point.

Intel was working at the time to place the better part of the Ethernet protocol onto an Ethernet controller (in total, maybe it was a five-chip set). This would replace at least a couple of PC boards' worth of circuitry, which were the only way at the time to make an Ethernet node.


I was fortunate enough to get a Mostek product engineer on the phone to talk about the effect the chip would have on the market. In those days it was pretty much required that there be alternative sources for important chips, in this case Mostek. The fellow described the volume anticipated over five or so years, and the pricing of the chip over that time. I transcribed his data points onto graph paper, and as the volume went up, the price went down. A very magical moment.
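Plotted today, that phone call is a textbook experience curve: unit price falling as a power law of cumulative volume. Here is a minimal sketch of the same exercise in Python, with made-up numbers standing in for the Mostek engineer's actual figures:

# Experience-curve sketch: unit price falls as a power law of cumulative
# volume. The volumes and prices below are hypothetical, for illustration.
import numpy as np

volume = np.array([10e3, 50e3, 250e3, 1e6, 5e6])    # cumulative units shipped
price = np.array([120.0, 80.0, 52.0, 34.0, 22.0])   # unit price, dollars

# Fit log(price) = b*log(volume) + log(a); b < 0 means price drops with volume.
b, log_a = np.polyfit(np.log(volume), np.log(price), 1)
print(f"learning exponent b = {b:.2f}")
print(f"each doubling of volume cuts the price about {(1 - 2**b):.0%}")

The graph-paper magic lives in that exponent: on log-log axes, the points fall on a straight line. - Jack Vaughan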

Sunday, April 3, 2016

Are dark pools coming to big data?

In February, Barclays and Credit Suisse settled with the SEC, which had uncovered their nefarious high-frequency manipulations in their dark trading pools. What's with that? To go figure, a good place to start is Flash Boys. But it is not an easy read.

Flash Boys delves into the netherworld of Wall Street trading in the 2000s, where the devil is in the latency and professional ethics is shit out of luck. Writer Michael Lewis paints a picture of an obsessively complex world of finance that attracts the underside of human aspiration. That echoes The Big Short, his earlier book and a quite successful film in 2015.

But here the technological complexity that serves the finance engine rather gets the better of the story; ultimately, Flash Boys pales a bit in comparison to The Big Short as a result. We have a worthy hero at the center of the tale in Brad Katsuyama of Royal Bank of Canada, but the story can be stumbly as it tries to convey his efforts to uncover the culprits in the dark pools of high-frequency trading. Those would be the people with the wherewithal to eavesdrop on the market, spoof your intention, and buy stock in mass quantities at prices slightly lower than what you will pay to buy it from them. Brad could be the Cisco Kid who heads them off at the pass, if the road to get there weren't so durn rocky.

I'd suggest that many of the wonders of big data today resemble the wonders of the stock market technology that front-runs it. Publish-and-subscribe middleware and fantastically tuned algorithms are common to both phenomena. Network latency can be the boogeyman in both cases. Yes, while nearly no one was looking, online big data made a high-frequency trading market out of advertising. The complexity is such that few can truthfully claim to understand it. And that lack of understanding is an opening for a con, as it was in The Big Short and Flash Boys. When you believe in things that you don't understand, then you suffer.
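As an aside: publish-and-subscribe middleware is a simple idea at its core. Producers post messages on a topic; any number of consumers receive them, with neither side knowing about the other. A toy Python sketch of the pattern (not any particular trading or ad-tech product) makes the shared plumbing plain:

# Toy publish/subscribe broker: the pattern beneath both market data
# feeds and ad exchanges, stripped of the networking and latency tricks.
from collections import defaultdict

class Broker:
    def __init__(self):
        self.subscribers = defaultdict(list)   # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Deliver the message to every subscriber on the topic.
        for callback in self.subscribers[topic]:
            callback(message)

broker = Broker()
broker.subscribe("quotes", lambda m: print("trader sees:", m))
broker.subscribe("quotes", lambda m: print("ad bidder sees:", m))
broker.publish("quotes", {"symbol": "XYZ", "bid": 10.01, "ask": 10.02})

In the real systems, the whole game is shaving microseconds off the hop from publish to subscribe; the structure stays the same. - Jack Vaughan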

Sunday, March 20, 2016

On the eve of the White House Water Summit

[Image: from On the Waterfront]
References to the Manhattan Project (for example, "we need a new Manhattan Project" to address fill-in-the-blank) are overdone. But we need something on that order to deal with water. California knows what it is like to live with this lifeblood threatened; Israel does too. It is good to cast attention on it, and that might happen to some extent this week as the White House Water Summit takes place.

One of the issues that must be addressed is data about water. It is not as good as data on oil or stocks, but it should be. In a New York Times op-ed column, Charles Fishman writes about water and data, and how weak the efforts are to categorize, detail and report on water use.

Imagine if NOAA reported on the weather only every fifth day. That is analogous to the water reports of the U.S. government, according to Fishman, who says that where water is concerned, we spend five years rolling up a report on a single year. The biggest problem, says Fishman, is water's invisibility, here and globally.

He focuses on the fact that the water census is done only every five years, which gives us only a 20 percent view of the total water experience: one year of data out of every five. He points to Flint, Toledo and the Colorado basin as recent water crises, and notes that adequately monitoring the water doesn't assure results, but that inadequately monitoring it is criminal, what with so much monitoring of Wall Street, Twitter tweets and auto traffic. Any call for more monitoring, of course, is up against today's version of the 1800s Know-Nothing movement.

Fishman tells us that good information does three things: it creates demand for more information; it changes people's behavior; and it ignites innovation.

But what is next? My little look-see into this area uncovered an overabundance of formats for representing water data. A first step toward water data improvements might come with the application of modern big data technology to the problem of multiple formats.
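What would that look like? At its least glamorous, something like the following Python sketch: readings arriving in different shapes get mapped onto one common schema. The field names and source formats here are invented for illustration, not drawn from any actual water agency:

# Normalize water-use records from two hypothetical source formats
# into one common schema (site_id, date, gallons).
import csv, json, io

def from_state_csv(text):
    # Hypothetical state format: site,when,usage_gal
    for row in csv.DictReader(io.StringIO(text)):
        yield {"site_id": row["site"], "date": row["when"],
               "gallons": float(row["usage_gal"])}

def from_utility_json(text):
    # Hypothetical utility format: list of {"station", "day", "cubic_ft"}
    for rec in json.loads(text):
        yield {"site_id": rec["station"], "date": rec["day"],
               "gallons": rec["cubic_ft"] * 7.48052}   # cubic feet -> gallons

records = list(from_state_csv("site,when,usage_gal\nA1,2016-03-01,5400\n"))
records += from_utility_json('[{"station": "B2", "day": "2016-03-01", "cubic_ft": 720}]')
print(records)   # one schema, whatever the source

Dull work, but it is the precondition for the every-year, rather than every-fifth-year, reporting Fishman calls for.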


Sunday, March 13, 2016

Four tales of data


Scenario - Four young students on spring break go to Honduras in search of the roots of big data. Each comes back with a story. Together, they tell of a struggle entwined. Look at data anew through the eyes of four groovy undergrads. Not yet rated.


Oceans of data - Hundreds of meters below the surface of the ocean, Laura Robinson probes the steep slopes of massive undersea mountains. She's on the hunt for thousand-year-old corals that she can test in a nuclear reactor to discover how the ocean changes over time. Big data is her co-pilot.

https://www.ted.com/talks/laura_robinson_the_secrets_i_find_on_the_mysterious_ocean_floor

Lord, bless your data - Writer F.D. Flam [I am not making this up] says Thomas Bayes set out to calculate the probability of God's existence. This was back in the 18th century in jolly old England. The math was difficult and really beyond the ken of calculation of the time, until the recent profusion of clustered computer power came around the corner in the early 2000s.

https://en.wikipedia.org/wiki/Thomas_Bayes
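The theorem at the bottom of it all fits on one line: P(H|E) = P(E|H) x P(H) / P(E). A small worked example in Python, using the stock disease-screening illustration rather than anything theological:

# Bayes' theorem on a textbook example: a 99%-accurate test for a
# condition that only 1 in 1,000 people actually has.
p_h = 0.001             # prior: P(condition)
p_e_given_h = 0.99      # P(positive test | condition)
p_e_given_not_h = 0.01  # false-positive rate

# Total probability of a positive test, then the posterior.
p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)
p_h_given_e = p_e_given_h * p_h / p_e
print(f"P(condition | positive) = {p_h_given_e:.3f}")   # about 0.090

The surprise, that a positive result still leaves only about a 9 percent chance of having the condition, is exactly the updating-on-evidence Bayes bequeathed us.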

Autopilot and the roots of cybernation - Elmer Sperry's son, Lawrence Burst Sperry, nicknamed Gyro, was best known for inventing the autopilot, using the concepts his father developed for the gyroscope. Alas, he was lost over the English Channel when only 31.

https://www.pinterest.com/pin/568649890431580951/

I was just trying to help people - The story by Veronique Greenwood tells of a doctor who wrote of her experience in a letter to the New England Journal of Medicine, and was subsequently warned by her bosses not to run that kind of query again. Presumably HIPAA privacy concerns are involved, so get out the anonymizer and gun it, right?

http://itsthedatatalking.blogspot.com/2014/10/calling-dr-data-dr-null-dr-data-for.html

-----

Data does baseball


[Infographic: Tools of Baseball Analytics, brought to you by MBA@Syracuse]

Data, Humans on the Side




Good things - The recent PBS show The Human Face of Big Data has much to recommend it. As someone who labors in the big data vineyard as a reporter and commentator, I appreciate its succinct, high-level view of one of the defining memes of now. I had the chance to speak with my colleague Ed Burns on the topic for the Talking Data podcast, and thought I'd add to the discussion here.

There were some beautiful animated pictures of data crossing the world, be it air traffic on a normal day or one tweet spreading. A theme was that humans must create narratives around the data (per Jack Dorsey), and follow the trail from the data point to the actual real-world event (Jer Thorp). What makes a culture collectively change its view of data? one participant asks. What is the context of the data? several query.




Cause for pause things - And that takes us to an underlying issue with the show, which is the unspoken progressive notion that we are getting better: that an Ivy League youngster who studied statistics and grew up with the Web for pablum, soda and bread can do better than the dimwits who went before. It could be true. But correlation is not causation. To phrase a coin.
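For what it's worth, the quip is easy to demonstrate: two series that merely trend over time will correlate strongly with no causal link at all. A quick illustration with synthetic data in Python:

# Two unrelated upward-trending series correlate strongly:
# correlation is not causation.
import numpy as np

rng = np.random.default_rng(0)
years = np.arange(2000, 2016)
web_users = 100 + 12 * (years - 2000) + rng.normal(0, 5, years.size)
cheese_eaten = 30 + 2 * (years - 2000) + rng.normal(0, 1, years.size)

r = np.corrcoef(web_users, cheese_eaten)[0, 1]
print(f"correlation = {r:.2f}")   # near 1.0, yet neither causes the other

The numbers here are invented; the trap is not. - Jack Vaughan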

Wednesday, March 9, 2016

On the machine learning curve

There was an article in the New York Times today I thought I might mention. "Taking Baby Steps Toward Software That Reasons Like Humans" (below) by John Markoff is an articulate look at what I call machine learning.

The story considers what is going on these days as a revitalization of artificial intelligence, which bloomed in the 1980s and then faded from headlines, and I agree. I think the story conveys, in a way, that there are some similarities.

The story doesn't use the term 'machine learning', though it does mention pattern recognition (somewhat synonymous), deep learning and deep neural nets, which are fairly similar. What I think that emphasizes is that today's 'machine learning' is basically a new take on neural networks.
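For readers who came in after the 1980s: a neural network, old take or new, is just layers of weighted sums squeezed through a nonlinearity, with the weights adjusted to reduce error. A minimal Python sketch, hand-rolled rather than from any vendor's library, learning the classic XOR function:

# A tiny two-layer neural network learning XOR by gradient descent:
# the core idea beneath today's "deep learning", minus the depth.
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])
ones = np.ones((4, 1))                 # bias inputs

W1 = rng.normal(0, 1, (3, 4))          # 2 inputs + bias -> 4 hidden units
W2 = rng.normal(0, 1, (5, 1))          # 4 hidden units + bias -> 1 output

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

for _ in range(20000):
    h = sigmoid(np.hstack([X, ones]) @ W1)        # forward pass, hidden layer
    out = sigmoid(np.hstack([h, ones]) @ W2)      # forward pass, output
    g_out = (out - y) * out * (1 - out)           # backpropagate the error
    g_h = (g_out @ W2[:4].T) * h * (1 - h)
    W2 -= 0.5 * np.hstack([h, ones]).T @ g_out    # gradient descent step
    W1 -= 0.5 * np.hstack([X, ones]).T @ g_h

print(out.ravel().round(2))   # should approach [0, 1, 1, 0]; seeds vary

Swap in more layers and far more data and you have, in caricature, the deep neural nets the story describes.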

And as such, machine learning faces hurdles, because the flaws that stymied AI still remain to be addressed. As Markoff writes, "generalized systems that approach human levels of understanding and reasoning have not been developed."

What he doesn't say is that the people who sell these things today tend to gloss over that, same as their counterparts 'back in the day.' That is not to criticize this particular work, which necessarily has a limited objective.

The story doesn't use the term 'cognitive computing' either. But it talks about things, Q&A systems and natural speech processing, that combine with 'deep learning' to create cognitive computing.


Taking Baby Steps Toward Software That Reasons Like Humans

By JOHN MARKOFF MARCH 6, 2016 

Richard Socher appeared nervous as he waited for his artificial intelligence program to answer a simple question: “Is the tennis player wearing a cap?” The word “processing” lingered on his laptop’s display for what felt like an eternity. Then the program offered the answer a human might have given instantly: “Yes.” ....

Saturday, February 6, 2016

First Thought - When data goes mystic, or, Data Myth

Thinking a bit about the past, and how we got here. Ruminating on the passing of Paul Kantner, and contemplating his dogged clutch on a futuristic, transcendent science fiction vision. And wondering where a lot of my impressions of technology initially emanate from: Shannon the telegraph messenger, Wiener the cyberneticist, the alchemists, neural nets.

Have taken time, too, to track back and visit Lewis Mumford, whom I have probably read before only once or twice removed. His Myth of the Machine, rich but scatter-plotted, sets a backdrop for the present moment of machine learning and the rebirth of AI, much as Kantner's Wooden Ships does.

Maybe by riffing on Mumford I can better characterize my moody interpretation of technology. I have put a mystical spin on technology for a very long time, and therefore I go back to another point in time to start over again, via Lewis Mumford, whose book is purply impenetrable: as much about science as Benedictine monks fermenting cheese or Mickey Mouse's sorcerer's apprentice's broom.

Mumford can see the days of yore that now escape us. He sees the envy of the birds in the desire to conquer the air: in the myth of Icarus, the flying carpet of the Arabian Nights, or the Peruvian flying figure of Ayar Katsi. [The index to The Myth of the Machine is like the debris of a cruise ship in the Sargasso Sea.]

Mumford notes that literate monks like Bacon and Magnus, the ones on the cusp of alchemy and modern science, when clockwork elements began to show the path to automation, did, like da Vinci, visualize elements that are still fodder for the Astonishing Tales tokens of our day: incredible flying machines, instantaneous communication, the transmutation of the elements. He notes too how magically influential the dynamo and the talking machine still were as he wrote, in the late 1960s.

Mumford mentions Thomas More and Utopia, and Bacon and The New Atlantis, in depicting the machine itself as an alternative way of reaching heaven. Language for him is a disease, with symptoms we see as dream symbols that become imposing metaphors that rule like myth. You can only filter what you see using the commanding metaphors of your age, he suggests. And the machine is that which bugs Mum.

What is the myth or master metaphor of today? The belly I labor in, bacteria-like, is made of the myth of data. Data bears a resemblance to penury as described [p. 274] by the Mummer man [who, by the way, had not too kind comments for his contemporary Marshall McLuhan]. Ask the people who sued Netflix for using their data in an A/B machine learning contest. What do they think of mystic data? Or the myth of the machine? - J.V.