Monday, April 10, 2017

But, it worked on my machine . . .

Courtesy: A B. Normal
Functional MRI, which shows the human brain in action, is growing in use within academic and research communities intent on uncovering the inner workings of the brain, Russell Poldrack of the Department of Psychology at Stanford University told a lecture audience at the Computer and Information Science and Engineering series at NSF. But there are issues. The approach is costly, and it is not clear how reproducible big data analysis of brain activity is. Reproducibility of experiments is central to the scientific method, but when R programs run on different machine configurations, different analytical results may emerge. Poldrack and others are working with software containers, virtual machines and registries to address this issue of computational reproducibility. - Jack Vaughan
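
A minimal sketch of how results can drift, in Python rather than R (an assumption made purely for illustration; none of this comes from Poldrack's talk). Floating-point addition is not associative, so the same numbers summed in a different order, as a different library build or machine might do, can give a slightly different answer. The helper name environment_fingerprint is hypothetical; it only records the sort of configuration details that a container or virtual machine is meant to pin down.

```python
import math
import platform
import sys


def environment_fingerprint():
    """Hypothetical helper: collect basic facts about the machine setup."""
    return {
        "python": sys.version.split()[0],
        "implementation": platform.python_implementation(),
        "machine": platform.machine(),
        "system": platform.system(),
    }


# The same three numbers, added in two different orders.
values = [0.1, 0.2, 0.3]
left_to_right = (values[0] + values[1]) + values[2]
right_to_left = values[0] + (values[1] + values[2])

# math.fsum tracks the rounding error and returns a correctly rounded sum,
# so it does not depend on the order of the inputs.
stable = math.fsum(values)

if __name__ == "__main__":
    print("environment:", environment_fingerprint())
    print("left to right:", repr(left_to_right))
    print("right to left:", repr(right_to_left))
    print("orders agree:", left_to_right == right_to_left)
    print("fsum:", repr(stable))
```

Saving the fingerprint alongside each result at least makes it possible to tell, after the fact, whether two runs came from the same setup.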


Sunday, April 2, 2017

The evening of a playing field?


Hand Of The Buddha
As the Republican Congress capitalizes on the friendly Republican White House, it is overturning a lot of rocks, and passing legislation friendly to one or another among various corporate interests - such as the cable and ISP business.

This week, the House voted to upend Obama-era FCC regulations that forbade ISPs from selling individuals' browser activity data without consent. Under those rules, the ISPs were to be put somewhat on the outside of the action in what is called big data - required to get formal permission from customers in order to sell browser histories to adtech markets - the ones dominated by Facebook and Google.

What's the difference between an ISP and a Google? Google provides a free service as part of an (admittedly murky) quid pro quo. You get a free browser and free search - and you tacitly give them the right to use you as a datum. With the ISP, you pay them - and not with a lot of choice either, as they are more often than not a monopoly in your neighborhood.

The stakes the ISPs have stuck in the Internet run deep. It can hardly be said this legislation is the evening of a playing field. Several of the companies have pledged to ask customers' permission before selling their (anonymized) browser history. It's likely best called a feel-good gesture on the part of the people pulling the puppet strings of government these days. You know, "Monday, geld the EPA." "Tuesday, affirm the right to kill sleeping bears in National Parks." "Wednesday, toss the ISPs a big data bone." - Jack Vaughan

RELATED
http://www.zdnet.com/article/isps-were-not-going-to-sell-your-web-browsing-data/
http://continuations.com/post/158773876945/government-just-gave-your-isp-even-more-power-you
https://www.forbes.com/sites/thomasbrewster/2017/03/30/fcc-privacy-rules-how-isps-will-actually-sell-your-data
https://www.wired.com/2017/03/big-cables-case-selling-data-doesnt-hold/

SOURCES
Cards Against Humanity creator Max Temkin
Matthew Hogan, CEO at DataCoup
Albert Wenger, a partner at Union Square Ventures, author of "World After Capital"
Dallas Harris, a policy fellow with consumer advocacy group Public Knowledge


Wednesday, March 29, 2017

We have met the enemy and it is us



“Eclairage”, in Nouveau Larousse Encyclopedia
Maybe Harvard, UCSD, Oxford, and every other institution hell-bent on starting a School of Data Science should look at a paper published at Mick Jagger's alma mater. It looks at the first data revolution, described as the period in the 19th Century that saw the beginning of social sciences, surveying and statistical ledgermania, and at the extent to which insidious bins, or categories, were created in the name of science and are now suspect.

Many social categories were designed to control, coerce and even oppress their targets. The poor, the unmarried mother, the illegitimate child, the black, the unemployed, the disabled, the dependent elderly – none of these social categories of person is a neutral framing of individual or collective circumstances. They are instead a judgement on their place in modernity and material grounds for research, analysis and policy interventions of various kinds. Two centuries after the first big data revolution many of these categories remain with us almost unchanged and, given what we know of their consequences, we have to ask what will be their situation when this second data revolution draws to a close?

On many a dark hour I have pondered technology's impact on science...and it usually comes down to the fact that the existing social and economic order is almost definitely going to make its mark on the tools of progress, as our authors here write. Where they find reason to be fearful is the likelihood of "the continuity of ideologically informed notions of ourselves and others and the reproduction of such ideologies in and through our new digital environments." Or as Pogo would have it: We have met the enemy and it is us.

http://blogs.lse.ac.uk/impactofsocialsciences/2015/10/13/ideological-inheritances-in-the-data-revolution/

Tuesday, March 28, 2017

The science of data science

Harvard will launch a data science program. That's to ride the wave caused by advances in digitization, and the resulting explosion of data. Principals point to examples: the explosion of genetics and genomics data in the life sciences, in molecular data, and in the humanities as well. The objective is to glean knowledge from this data.

http://news.harvard.edu/gazette/story/2017/03/co-directors-of-newly-launched-harvard-data-science-initiative-discuss-new-era/

Monday, December 26, 2016

Big Data Psycho

What's going on behind that Facebook quiz? Cambridge Analytica gets a look at quiz takers' personality scores and, thanks to Facebook, gains access to their profiles and real names. The firm sells analytics, data, profiles. It's what they call big data psychographics.

The big data world that I work in has a sort of a hangover going on right now – some dizzy blur after a long period of heady growth. Something happened in the way of a Godsmack, on the road to Antioch, in the shape of Brexit and Donald Trump’s surprising rise. For many a wonk, it is no surprise.

Anyone who looks objectively at the data analytics of this or any day knows there is plenty of room for mistakes. As with any hot technology, there's also a lot of space for hyperbole. The journalist’s job is to keep an eye on the chance of failure while reporting the assertions of people making waves with that hot technology.

It is therefore a good time for us to consider the recent article penned by Sue Halpern for the New York Review of Books, which starts with a vignette describing the number of data points - 98 - that Facebook collects on each of a gazillion members. There is some hilarity, as the writer uncovers the false persona Facebook might construct about her – or you, or me.

Halpern learns by digging into Facebook that the uber site mistakenly views her as a guy, probably a gay guy, because she tends to evince gay-guy characteristics. That is what the algorithm hath writ, because Halpern reads The New York Times (and the New York Review of Books).

She writes that the big data proponents want us to believe that data analysis will deliver to us a truth that is free of messiness or idiosyncrasy. Truth is full of such, but humans are prepared to gloss over it.

Data science today tends toward the reductive – it puts people in compartments. Studies prove this! And underlying the whole big data wave is advertising, which has always had an aspect of whimsy and subterfuge. In the days of old, we sent our children to school to learn this, to protect them. Too often now the kids are sent to the better schools to figure out how to exploit the subterfuge. According to Halpern, we need to recognize that the fallibility of human beings is written into the algorithms that they write. - Jack Vaughan

They Have, Right Now, Another You - NYRB


Thursday, December 1, 2016

Hedger with time on hands bets he can improve boffin computing

Retired billionaire hedge fund manager James H. Simons will fund a research institute to apply advanced computing techniques to scientific problems.

A New York Times story by Kenneth Chang says Simons feels he has identified a weakness in academia, where science students so often turn to computer programming only because it is necessary to their research.

As they move up or out of their profession, their software tool creations go too. No V.2's.

The software that derives from the “Flatiron Institute’s” efforts will be made available for all scientists, it is said. Up first: Computational biology. Big data analytics seems to be a special focus. 

I am not sure about the premise. So many great programmers started as students in the sciences! So much in high performance computing was driven by academic scientists too.

Many of the recent advances in big data have happened beyond the ken of science and academia, it’s true. But Spark? Machine learning? Well, much of that work came out of the academy. 

From a press release:

The FI is the first multidisciplinary institute focused entirely on computation. It is also the first center of its kind to be wholly supported by private philanthropy, providing a permanent home for up to 250 scientists and collaborating expert programmers all working together to create, deploy and support new state-of-the-art computational methods. Few existing institutions support the combination of scientists and programmers, instead leaving programming to relatively impermanent graduate students and postdoctoral fellows, and none have done so at the scale of the Flatiron Institute or with such a broad scope, at a single location...The institute will hold conferences and meetings and serve as a focal point for computational science around the world.


Would it be good to have a new effort that served as a new hub for advances in scientific computation? Yes. This will be an interesting development to watch. – Jack Vaughan

Tuesday, October 25, 2016

Crunch time, Capt.