Sunday, December 3, 2017

Paradise Graph Papers


The Paradise Papers files expose offshore holdings of political leaders and their financiers, as well as household-name companies that slash taxes through transactions conducted in secret. Financial deals of billionaires and celebrities are also revealed in the documents. The 1.4 TB of data – 13.4 million documents – includes information leaked from trust company Asiaciti and from Appleby, a 100-year-old offshore law firm specializing in tax havens, as well as other leaked information. More to come.


Related




https://linkurio.us/blog/big-data-technology-fraud-investigations/

Friday, October 13, 2017

Does data make baseball duller?

Let's not talk quality of life and data; let's talk baseball and data. Moneyball was an eye-opener in the rise of big data analytics as a popular meme. And why not? It had Brad Pitt. Well, the movie did. It showed that a guy thinking outside of the box could re-imagine the game. The hell with 'he looks like a ball player'; hello to 'can he take a walk?' For a small-market team - a tonic. But now we are seeing a great downside of worshiping at the altar of data: really boring baseball. Removing too many pitchers too soon... Embracing strikeouts... Avoiding ground balls and line-drive hits... Focusing on homers... Still, one wonders if some of these moves augur obvious countermoves for the outside-the-box thinkers of today. In the face of elaborate, boring shifts, why not bunt?

https://www.wsj.com/articles/the-downside-of-baseballs-data-revolutionlong-games-less-action-1507043924

Sunday, September 3, 2017

Forensic analytics

While at Bell Labs in the 1980s, Dalal said, he worked with a team that looked back on the 1986 Challenger space shuttle disaster to find out if the event could have been predicted. It is well-known that engineering teams held a tense teleconference the night before the launch to review data that measured risk. Ultimately, a go was ordered, even though Cape Canaveral, Fla., temperatures were much lower than in any previous shuttle flight. A recent article looks at the issues with an eye on how they are related to analytics today.

http://searchdatamanagement.techtarget.com/opinion/Making-connections-Big-data-algorithms-walk-a-thin-line

Saturday, August 19, 2017

Working notes - The profession of data

The software profession took clear steps forward during the 1960s. Software had become essential to U.S. military defense, not to mention IBM, and, with the imperative to go to the Moon, there appeared money and interest enough for methodologists to ponder what type of professional processes could lead to predictable, successful and repeatable outcomes.

More

https://moontravellerherald.blogspot.com/2017/09/data-science-and-dev-ops-thoughts.html

Sunday, June 18, 2017

Data, like people, can lie

Three cheers for the West Virginia University team whose research caused the net to close in on the existential Euro varmints of VW, who used software to neuter U.S. emission tests (and probably laughed about it over beers). Data, like people, can lie. Data exists within the construct of the civilization around it - What's the bet that a few euros in an offshore bank will cause Trump and Bannon to cut research funding into emissions? - Jack Vaughan

https://www.nytimes.com/2017/05/06/business/inside-vws-campaign-of-trickery.html

Monday, June 5, 2017

It couldn't look bleaker unless your name is Meeker

Mary Meeker's annual report for Kleiner Perkins on the status of Internet commerce is always interesting - chock full of data and packed with gleefully greedy West Coast VC perspective. Let's look at some high points of the 150-plus-slide PowerPoint opus.

Do you smell the fear in the Fortune 500? Smells like they could use some baby wipes. They can get them from Amazon, actually, which trails only Huggies and Pampers for online market share. For Duracell, it is deep doodoo, as Amazon surpasses the checkout-counter champ entirely - on the Web. All that marketing and technology innovation - not to mention shelf shoving - over many years seems to have been for little or naught. (Offbeat: I worked for six months at a drugstore on 34th St. in the 1970s, and among the things I learned was: "You cannot keep Pampers on the shelf." Translation: Shit happens.)




The sound of footsteps echoes double in network television, where the biggies are flat or in decline, but Netflix is skyrocketing.



And disruptors (the Internet advertising vehicles that disrupted conventional media) can be disrupted too, especially if they face big hungry disruptors such as Facebook and Google, which grow ad revenue in double digits while Everybody Else contests the small leftover slices of the pie.





Maybe Facebook and Google are as much beneficiaries of an underlying sea change in Internet usage as of anything else. While desktop and laptop Internet use has been steady or in slight decline over the last eight years, Internet time on the smartphone side has been vaulting forward. What is different about mobile? The message might be really, really concise, the ambiance more transactional, and the market more consumerish.



--
Related
http://www.kpcb.com/internet-trends

Sunday, June 4, 2017

Technology and Koyaanisqatsi

By U.S. HUD 
Koyaanisqatsi is a Hopi concept brought to wide attention as the title and central motif in Godfrey Reggio's 1983 movie. There are a lot of ways of looking at the meaning of the word, but the one that Reggio lit upon can stand, as it is a pretty useful way of framing the modern world. Koyaanisqatsi represents "Life out of balance," and, to me, that representation aptly depicts 1000-plus years of rising science and technology threatening human values.

I suppose there is more - that there is a unified theory gluon waiting, something like Capital, or Greed. But let's start with some simplicity, asking: Where is harmonious technology and humanism to be found, and where and why does Koyaanisqatsi begin to emerge? - Jack Vaughan

Tuesday, May 23, 2017

Partners HealthCare: "I sing the General Electric"

Massachusetts-based Partners HealthCare partnered with GE Healthcare last week on a projected 10-year collaboration to bring greater use of AI-based deep learning technology to healthcare across the entire continuum of care. The collaboration will be executed through the newly formed Massachusetts General Hospital and Brigham and Women’s Hospital Center for Clinical Data Science and will feature co-located, multidisciplinary teams with broad access to data, computational infrastructure and clinical expertise.

The deal is something of a stake in the ground, as GE moves its HQ up from Connecticut to Boston. In the long term, Partners and GE hope to create new businesses around AI and healthcare.

The initial focus of the relationship will be on the development of applications aimed at improving clinician productivity and patient outcomes in diagnostic imaging. It will be interesting, as more details emerge, to see how this effort compares or contrasts with efforts such as IBM Watson Imaging Clinical Review - a cognitive imaging offering from that company's Watson Health operation, as part of a collaborative of 24 organizations worldwide. - Smiling Jack Shroud


http://www.partners.org/Newsroom/Press-Releases/Partners-GE-Healthcare-Collaboration.aspx

http://www-03.ibm.com/press/us/en/pressrelease/51643.wss

Tuesday, May 2, 2017

30 Second History of Corporate Data Processing

I just finished some research on the definition of data. I hope to point to it when it is published.


http://dispatchtelegraph.tumblr.com/post/160228081249/evolution-of-the-new-york-times-front-page
What I learned: Data sort of went into a new era with Claude Shannon's work in Boolean math, data compression and cryptography, which was nearly concurrent with the birth of transistors and the advent of electronic and magnetic representation of data signals in and around the computer. Of course there was a sidestep (and homage to Hollerith and Jacquard) with punch cards - which carried the anti-conformist message of the time: Don't bend, fold or mutilate me.

Next up were databases that organized data in an increasingly efficient way. Then came relational databases, which had such use in business, and SPSS, which had use in statistics, sociology and academia.

All along the way, data was becoming more of a commodity - and a series of professions was building up around its evolution as a commodity. That held until the advent of big data - that being unstructured data in the main - data coming from outside of the organization - data being created by consumers as they go about their activity.

And so you have today's world, with data as a business - in the cases of companies like Google and Facebook, being the total basis of their business - and you have concerns about data privacy. - Jack Vaughan

Saturday, April 22, 2017

Shannon, information and noise

Dr Disruption Signal Sign
This fits with things coming up again. I wrote it for ITWorld when Claude Shannon died. It has also re-run on MoonTraveller.

OBITUARY-APPRECIATION - We live in an age highly influenced by information technology. For many people, it has become the basis for a life's work. For a few, at least, it has meant great fortunes.
Most of the great technologists who set the stage for this era -- for example, Norbert Wiener, Vannevar Bush, Alan Turing, and John von Neumann -- are long dead. But Claude Shannon, the great theorist who formed the most basic tenets of the information age, survived until last weekend. He died at 84 last Saturday in Medford, Mass., after a long fight with Alzheimer's disease.

Shannon's work, like his passing, may not be widely noted among many who have followed him in the information, technology, and e-commerce industries. But there is little question that he is the chief progenitor of information theory and modern digital communications. Shannon's mathematical thinking and writing laid the groundwork for most of today's information technology industry. He is the man who discovered 1's and 0's in electronic communication.

Shannon was born in Petoskey, Mich., and grew up in Gaylord, Mich. He worked as a messenger for Western Union while in Gaylord High School, and attended college at MIT, where he was a member of Tau Beta Pi.

Although the algebra of digital binary bits was first uncovered by mathematician George Boole in the mid-19th century, it was Shannon who saw the value of applying that form of logic to electronic communications. As a student of Vannevar Bush's at MIT in the 1930s, he worked on the differential analyzer, perhaps the greatest mechanical (analog) calculator. His paper, "A Symbolic Analysis of Relay and Switching Circuits," which led to a long association with Bell Laboratories, laid out Shannon's theories on the relationship of symbolic logic and relay circuits.

While at Bell Labs, Shannon wrote the landmark "The Mathematical Theory of Communication." The information content of a message, he theorized, consists simply of the number of 1's and 0's it takes to transmit it. In a real sense, Shannon conceived of the "bit" that is now so widely used to represent data.
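
To make the "number of 1's and 0's" idea concrete, here is a minimal sketch in Python. It is an illustration added for this reading, not anything from Shannon's paper: the function names are hypothetical, and it assumes the simplest possible model, in which each character of a message is drawn independently with the frequencies observed in that message. Under that assumption, Shannon's entropy formula gives the average number of bits needed per symbol.

```python
import math
from collections import Counter


def entropy_bits_per_symbol(message: str) -> float:
    """Average bits per symbol, using the character frequencies
    observed in the message itself (an illustrative assumption)."""
    if not message:
        return 0.0
    counts = Counter(message)
    total = len(message)
    # H = -sum(p * log2(p)) over the distinct symbols
    h = 0.0
    for n in counts.values():
        p = n / total
        h -= p * math.log2(p)
    return h


def estimated_total_bits(message: str) -> float:
    """Rough lower bound on the 1's and 0's needed for the whole
    message under the same independent-symbol assumption."""
    return entropy_bits_per_symbol(message) * len(message)


if __name__ == "__main__":
    for text in ("aaaa", "abab", "abcd"):
        print(text, entropy_bits_per_symbol(text), estimated_total_bits(text))
```

A message that repeats one character carries no surprise and needs essentially zero bits per symbol under this model, while "abab" works out to one bit per symbol. Real messages have structure this toy model ignores, so treat the numbers as a rough guide only.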

Later, he became a professor at MIT. His students included Marvin Minsky and others who became notable in the field of artificial intelligence. While Shannon's thinking could captivate academicians, it was equally appealing to practical engineers.

Shannon's work led to many inventions used by both technology developers and end users. His theories can truly be described as pervasive today.

When I was young, Shannon's work was a tough nut to crack, but it certainly was intriguing. As a high school boy, I was interested in the future -- maybe more so than now, when I live and breathe and work in what that future became. Grappling with Shannon's basic information theories was part of my education about the future.

Growing up in a Wisconsin city across the lake from Shannon's birthplace, I tried to plow through the town library as best I could. I wanted to learn about computers, automation, and the combination of the two that was known in those days (the 1960s) as cybermation. I discovered for myself -- by chance, really -- that the fundamental elements of those ideas were Shannon's inventions.

For the better part of Shannon's life, analog communication ruled. Of course, his greatest achievement was visualizing digital communication. Much of his greatest work revolved around defining information in relation to "noise," the latter phenomenon being quite familiar to anyone who often tried desperately to home in on radio signals before digital communication filters came into being. I came to appreciate that aspect of Shannon's work later on when, as a journalist, I had the opportunity to learn and write about digital signal processing.

Then I found out that Shannon had laid the groundwork for modern error correction coding, an essential element of things like hard disk drive design and digital audio streaming, and probably many things yet to come.

Day and night, data, messages, music, and more swirl around us -- all made possible to some extent by the idea of communicating electronically in 1's and 0's. It is something to think that a Western Union messenger could have conceived of this new world. - Jack Vaughan, originally published in ITWorld.com

Wednesday, April 12, 2017

The road to machine learning

In some spare time last year I reworked a story I'd written for TechTarget on the topic of machine learning and related data infrastructure. I spoke with a few users in the story, so the reworking focuses on their end-use applications, while also covering vendors and technology employed. I did this as a Sway multimedia presentation. I faltered in final completion, in that an audio track was the proverbial bridge too far. So I call it a failure (experience tells us to jump from projects that are "projects of one"), but I thought it was worth posting here, for experimental purposes. To read the story it is based on, please go to "Machine Learning tools pose educational challenges for users." - Jack Vaughan


Monday, April 10, 2017

But, it worked on my machine . . .

Courtesy: A B. Normal
Functional MRI, which shows the human brain in action, is growing in use within academic and research communities intent on uncovering the inner workings of the brain, Russell Poldrack of the Department of Psychology at Stanford University told a lecture audience at the Computer and Information Science and Engineering series at NSF. But there are issues. The approach is costly, and it is not clear how reproducible big data analysis of brain activity is. Reproducibility of experiments using the scientific method is very important, but when R programs run on different machine configurations, different analytical results may emerge. Poldrack and others are working with software containers, virtual machines and registries to address this issue of computational reproducibility. - Jack Vaughan



Sunday, April 2, 2017

The evening of a playing field?


Hand Of The Buddha
As the Republican Congress capitalizes on the friendly Republican White House, it is overturning a lot of rocks and passing legislation friendly to one or another among various corporate interests - such as the cable and ISP business.

This week, the House voted to upend Obama-era FCC regulations that forbade ISPs from selling individuals' browser activity data. Under those rules, ISPs were to be kept somewhat on the outside of the action in what is called big data - required to get formal permission from customers in order to sell browser histories to adtech markets - the ones dominated by Facebook and Google.

What's the difference between an ISP and a Google? Google provides a free service as part of an (admittedly murky) quid pro quo. You get a free browser and free search - and you tacitly give them the right to use you as a datum. With the ISP, you pay them - and not with a lot of choice either, as they are more often than not a monopoly in your neighborhood.

The stakes ISPs have stuck in the Internet run deep. It can hardly be said this legislation is the evening of a playing field. Several of the companies have pledged to ask for customers' permission before selling their (anonymized) browser history. It's likely best called a feel-good gesture on the part of the people pulling the puppet strings of government these days. You know, "Monday, geld the EPA." "Tuesday, affirm the right to kill sleeping bears in National Parks." "Wednesday, toss the ISPs a big data bone." - Jack Vaughan

RELATED
http://www.zdnet.com/article/isps-were-not-going-to-sell-your-web-browsing-data/
http://continuations.com/post/158773876945/government-just-gave-your-isp-even-more-power-you
https://www.forbes.com/sites/thomasbrewster/2017/03/30/fcc-privacy-rules-how-isps-will-actually-sell-your-data
https://www.wired.com/2017/03/big-cables-case-selling-data-doesnt-hold/

SOURCES
Max Temkin, creator of Cards Against Humanity
Matthew Hogan, CEO at DataCoup
Albert Wenger, a partner at Union Square Ventures and author of "World After Capital"
Dallas Harris, a policy fellow with consumer advocacy group Public Knowledge


Wednesday, March 29, 2017

We have met the enemy and it is us



“Eclairage”, in Nouveau Larousse Encyclopedia
Maybe Harvard, UCSD, Oxford, and every other institution hell-bent on starting a School of Data Science should look at a paper published at Mick Jagger's alma mater. It looks at the first data revolution, described as the period in the 19th century that saw the beginning of social sciences, surveying and statistical ledgermania, and at the extent to which insidious bins, or categories, now suspect, were created in the name of science.

Many social categories were designed to control, coerce and even oppress their targets. The poor, the unmarried mother, the illegitimate child, the black, the unemployed, the disabled, the dependent elderly – none of these social categories of person is a neutral framing of individual or collective circumstances. They are instead a judgement on their place in modernity and material grounds for research, analysis and policy interventions of various kinds. Two centuries after the first big data revolution many of these categories remain with us almost unchanged and, given what we know of their consequences, we have to ask what will be their situation when this second data revolution draws to a close?

On many a dark hour I have pondered technology's impact on science... and it usually comes down to the fact that the existing social and economic order is almost definitely going to make its mark on the tools of progress. As our authors here write, where they find reason to be fearful is the likelihood of "the continuity of ideologically informed notions of ourselves and others and the reproduction of such ideologies in and through our new digital environments." Or as Pogo would have it: We have met the enemy and it is us.

http://blogs.lse.ac.uk/impactofsocialsciences/2015/10/13/ideological-inheritances-in-the-data-revolution/

Tuesday, March 28, 2017

The science of data science

Harvard will launch a data science program. That's to ride the wave caused by advances in digitization and the resulting explosion of data. Principals point to examples: the explosion of genetics and genomics data in the life sciences, of molecular data, and of data in the humanities as well. The objective is to glean knowledge from this data.

http://news.harvard.edu/gazette/story/2017/03/co-directors-of-newly-launched-harvard-data-science-initiative-discuss-new-era/