Sunday, April 26, 2015

Rah rah, Data, Go, go, go

SwarmDroneDessertBoing
Data as an enthusiasm, or even a hobby, is in the air, as noted in an Economist article (Briefing: Clever Cities: The Multiplexed Metropolis, Sept. 7, 2013, p. 21). But does close inspection of the results to date tell us the enthusiasm is warranted? Is this truly like the introduction of electricity to the city?

One young Dutchman developed a mobile app that tapped into open data to predict the best and easiest areas of the city to rob. Thankfully, in this instance it was done to kindle debate. The Smarter City has a Darker Side, and not just in SciFi.

Anyway, who benefited most from the introduction of electricity, and if data is as powerful a game changer, who will benefit most on this go-round? "The importance of political culture will remain," the writer opines. And it is true. The political culture likely remains more important than any transient technology advance in terms of how the pie gets cut up.

Human behavior is good and bad. If there is a bad side, there is an app for that.

Saturday, April 11, 2015

Behind the music – Spark and the PDP-11

The DataDataData (Itsthedatatalking) blog is meant to focus on data today – not to rehash my history of computing. But sometimes it veers that way, and I will just be holding on to nothing but the wheel. But I digress.

Apple scruff at The Smithsonian
Spark is the latest new shiny object in data processing. That said, I don’t mean to belittle its potential. The folks who fashioned it in the vaunted AMPLab at UC Berkeley are supersmart, and very aware of what the advent of multicore microprocessors meant to computing: that new routes to big clusters of parallelism were available, if only the complexity could be abstracted downwards into clever libraries and runtimes.

People selling Spark come in your door selling Hadoop, which has had plenty of publicity and is borderline ready for primetime. Once in there, they may mention that you can toss Hadoop, but only if they think you may cotton to that. After writing about Hadoop for about two years, I took some care in approaching Spark. Finally some words from way back came back. Please, let me digress some more.

Long ago and far away I sat with my boss discussing the news. The news on that day in 1992 was the ouster of Digital Equipment Corp.'s co-founder Ken Olsen. His departure was an inflection point along a trail that saw DEC go from being a gutsy Maynard, Mass. mill town startup to being a serious threat to IBM's industry leadership to being a forlorn merger candidate.

Like those in other editorial offices, my boss and I wondered what went wrong. What went wrong was the company got confused about what business it was really in. Seems absurd, but it can happen.

DEC's Olsen did not like the PC or Unix, two very innovative industry trends that his subordinates learned to basically eschew. Missing the move to small personal computers was especially ironic, as DEC itself rose in the 1960s on the back of minicomputers that downsized the capabilities of the larger, then-dominant mainframe computer. Anyway, on this particular day I was especially interested to see my editor's take on this. That was because his experience went beyond running a magazine called EDN.

You see, as a graduate student, Jon Titus had been in the vanguard of what came to be known as microcomputers, or PCs. A July 1974 Radio Electronics issue that featured Titus's 8008-based "Mark-8 Personal Minicomputer" kit predated Popular Electronics' Altair 8800 cover story by six months.

In Cambridge, Mass., Paul Allen picked up a copy of the latter magazine and brought it to Harvard student Bill Gates's dorm to share with him, and a new era of computing was off and running. Note that Titus and the Radio Electronics editors called the Mark-8 a personal minicomputer. So, Titus had a unique perspective on Ken Olsen's quandary.

"DEC came to think they were selling minicomputers," Titus said. "But what they were selling was computing."

Anyway, I link below to the full story on this, which ran on SearchDataManagement.com. I'd like to add here what a great boss Jon Titus was for me. He stood by me, more than once, which I never will forget. My spousal unit and I got to Washington last week. We went to the Smithsonian museum (actually, just two days after this story went live) and were told that the computer exhibit was closed for repairs (a lot of people can relate to that, eh?!) so we did not see the Mark-8 on display. Instead there was the computer that has, and maybe rightfully so, gained the lion's share of the fame.
A cruel old engineer.

That is the Apple II of Steve Wozniak and Steve Jobs. A woman came by and asked the air: "Is that the first computer?" "No," said I, trying to be courteous, "the first computers were as big as rooms - that is what many people consider to be the first personal computer." Sorry, that's it for now - I got to go digress. – Jack Vaughan

Read Apache Spark meets the PDP-11 -- in the end, it's all about the processing – SearchDataManagement.com, Mar. 31, 2015 http://bit.ly/1Im9n1l

Wednesday, April 8, 2015

Give me Algorithmic Accountability Or

Give me Algorithmic Accountability or give me… ah, what is the alternative again?

I thought Steve Lohr's article in yesterday's New York Times was worth pointing out, as it boils up a larger issue from the flotsam and jetsam of the big data analytics parade. Online ads, the killer app (to date) for big data and machine learning, are but a Petri dish, he says. After all, if the wrong ad is served up, the penalty is mild. But, he writes, the stakes are rising. Companies and governments will churn big data to prevent crime, diagnose illness, and more. Why, just the other day JP Morgan said it could spot a rogue trader before he or she went rogue.

The algorithms that make the decisions may need more human oversight, the writer and others tend to suggest. Civil rights organizations are among those suggesting it. Another is Rajeev Date, formerly of the Consumer Financial Protection Bureau. The story focuses on the notion of Algorithmic Accountability (meeting tonight in the church basement, no smoking please) as an antidote to brewing mayhem.

IBM Watson appears in the story. It is hard to get a handle on Watson, but one thing is crystal clear: the mountain of documents is growing beyond managers’ capacity to understand, and Google is paling under the weight. Watson is meant to do the first cut on finding a gem in, for example, the medical literature – reading ‘many thousands of documents per second.’ Along the way, a few researchers may lose their jobs, but the remaining managers will need coffee and servers are wanted.

Haven't heard for a while of Danny Hillis – he founded Thinking Machines back in the day. The original cognitive computer? Or was that the old Ratiocinator (but I digress). Hillis says data storytelling is key: to find, like old man Chaucer, narrative in the confused data stream. If the storyteller had a moral compass, that would be an additional positive factor, if you take Louis Beryl’s word for it. He is cofounder of Earnest, a company that has staff to keep an eye on the predictor engine output.

Less opacity would be good, Lohr concludes, as Gary King, director of Harvard’s Institute for Quantitative Social Science, joins the narrative. The Learning Machines should learn to err on the side of the individual in the data pool. If that happened, you would get that bank loan that might be a little iffy, rather than have a fairly innocuous money request rejected. George Bailey would be the patron saint of the Moralistic Data Story Telling Engineer.

I am trying to think of a case where the owners of the machines programmed them that way ... but a parted-lipped Jennifer Lawrence is in a Dior ad contiguous with Lohr’s "Maintaining a Human Touch As the Algorithms Get to Work" (NYT, Apr. 7, 2015, p. A3) and my train of thought has left the station.

Data science should not happen in the dark. We have, in fact, a classic humanization-computerization dilemma aborning. Academia and associations, mobilize! – Jack Vaughan, Futurist


[Imagine Betty Crocker working a conveyor belt where algorithms are conveyed. I do.]