Categories
Cloud Education Google Internet2 Network Reading Technology WiscNet

Latest read: Reliability Assurance of Big Data in the Cloud: Cost-Effective Replication-Based Storage

While focused on the task of generating data for astrophysics, Reliability Assurance of Big Data in the Cloud is a worthy read when designing cloud service contracts.
The work of authors Yang, Li and Yuan centers on capturing big data reliability and measuring disk storage solutions, including those from noted cloud vendors.

Their work at the Centre for Astrophysics and Supercomputing at Swinburne University of Technology focused on methods for reducing cloud-based storage cost and energy consumption.

They also share the impact of multiple replication-based data storage approaches based upon Proactive Replica Checking for Reliability (PRCR), which I found to be the most interesting part of their research data gathering.

I found Reliability Assurance of Big Data in the Cloud also supports moving data into the cloud across advanced research networks including Internet2.

Processing raw data inside the data center impacts network models (based upon available bandwidth) in their work. Their research gathers and stores 8-minute segments of telescope data that generate 236GB of raw data. This is by no means at the petabyte stage (yet), but it still sets a solid understanding of the contractual demands on big data cloud storage.
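To put those figures in contract terms, here is a rough back-of-envelope sketch of my own (it is not from the book). It assumes each 8-minute segment yields the full 236GB, that GB and PB are decimal units, and that observation runs continuously; the function names are hypothetical and only for illustration.

```python
# Back-of-envelope sizing. Only the 8 minutes and 236 GB come from the post;
# everything else (decimal units, continuous observation) is an assumption.
SEGMENT_MINUTES = 8
SEGMENT_GB = 236          # assumed: 236 GB per 8-minute segment, decimal GB
GB_PER_PB = 1_000_000     # assumed: decimal petabyte


def sustained_rate_gbit_per_s(segment_gb=SEGMENT_GB, segment_minutes=SEGMENT_MINUTES):
    """Average network rate needed to move one segment in the time it takes to record it."""
    return segment_gb * 8 / (segment_minutes * 60)


def days_to_reach(target_gb, segment_gb=SEGMENT_GB, segment_minutes=SEGMENT_MINUTES):
    """Days of continuous observation needed to accumulate target_gb of raw data."""
    segments = target_gb / segment_gb
    return segments * segment_minutes / (60 * 24)


if __name__ == "__main__":
    print(f"Sustained rate: {sustained_rate_gbit_per_s():.2f} Gbit/s")
    print(f"Days to reach 1 PB: {days_to_reach(GB_PER_PB):.1f}")
```

Under those assumptions the raw stream averages a little under 4 Gbit/s and would pass one petabyte after roughly three and a half weeks of non-stop observing, which is the kind of number worth having in hand before negotiating storage and bandwidth terms.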

My interest was piqued by the impact on developing knowledgeable contracts for cloud services. Their background regarding data gathering and processing should influence procurement contract language. This is even more applicable when applied to petabyte data sets and the SLAs regarding data reliability requirements. Never leave money on the table when scaling to the petabyte range. A must-read for purchasing agents and corporate (and university) CPSMs.

Categories
Cyberinfrastructure Design Education Google Innovation Internet2 Network OpenSource Reading Technology Vietnam War

The Vietnam War: Unstructured data reporting and counterinsurgency

After reading No Sure Victory: Measuring U.S. Army Effectiveness and Progress in the Vietnam War, I could not help but think that the consequences of failed data reporting by MACV can serve as a historical lesson for re-implementing or adjusting campus data reporting systems.

Data report tickets used by MACV in the early stages of The Vietnam War

Key stakeholders on campus should be able to easily state their reasons for data collection and reporting. No Sure Victory benefits campus units by revealing an early, dare I say, Big Data approach to unstructured data reporting and delivering actionable data.

Today we would immediately turn to Google’s Compute Engine or an Amazon Elastic MapReduce cloud to meet this demand.

Universities can thrive with diverse reporting teams. Working with Institutional Research and striving to improve enrollment and retention efforts are key goals. Yet important roles are filled with student workers. Here unstructured data often fragments through mismanagement. Many ad hoc Microsoft Excel documents are created without data governance and become siloed from the campus data warehouse. Key stakeholders on any campus include CIOs, IR Directors, research staff, Program Directors, campus data report writers and student workers. Even seasoned campus data report writers are not leveraged to streamline actionable data insights.

No Sure Victory brings to light a tragic failed data reporting implementation by Secretary of Defense Robert McNamara in addressing the war in Vietnam. Behind it stood his reputation as one of the Whiz Kids, the World War II Statistical Control unit that analyzed operational and logistical data to manage war.

Categories
Design Education Innovation Network Reading Technology

Latest read: The Data Science Handbook

The recent pre-release of The Data Science Handbook is a fast, easy read. There is nothing better in business today than the still-exploding market of data science. While some marketing statements indicate many are trying data science, here are the voices of recognized data science leaders. I have read my share of data science and big data books, and I like the direction of this pre-release.

Maturing technologies like Hadoop and even MapReduce prove yesterday was the time for every organization, business unit and non-profit to understand how data science is fundamentally changing the game.

Data Science hits your data sweet spot due to the ability of large systems to process your data in real-time. Notice how Microsoft is acquiring data science companies?

Data Science was still in its early stages no more than 10 years ago. Yahoo and Google helped move this forward. Even “legacy” companies like Sears Holdings, well outside Silicon Valley, understand the impact of MapReduce and Hadoop. Just wait until some great advancements for public health are established by non-profits as a result of implementing data science to forecast their business.

There is a great deal of excitement as the full release publication date inches closer. Cannot wait to see this book ship.

Categories
Cloud Education Innovation Network Reading

Latest read: Disruptive Possibilities

There can be no doubt today that Big Data has changed everything. Jeffrey Needham has written a great book, Disruptive Possibilities: How Big Data Changes Everything. It's all about the impact of Hadoop in the cloud as the ultimate computing platform.

I was very pleased reading his work when I found his personal story at the end regarding the application of Hadoop in neuroscience as a method to address Sturge-Weber Syndrome. We know it as having a port-wine stain on your face.

His story made me appreciate his desire to throw Hadoop at the datasets that may one day reveal a cure for this syndrome. I am amazed at how he described reteaching himself not only how to walk down a hallway, but also how to train his body to hit a baseball after losing vision in his right eye.

My favorite segment of Disruptive Possibilities is chapter five: When Clouds meet Big Data. Needham also makes for a very easy read in chapters one to four, where he lays the foundation based upon his deep experience with Hadoop. And yes, you can run Hadoop off laptops found in a dumpster.

There is much to learn in university circles about the impact of Disruptive Possibilities and Hadoop. Worry not: it's not the computing or research units that I am thinking about, but rather HR, Admissions and just about every other campus unit that would benefit from moving their data into a Hadoop cluster in order to data mine their future.

Categories
Cloud Cyberinfrastructure Network Reading Technology

Latest read: Online Payments Risk Management

Online Payments Risk Management is certainly a hot topic. After the 2013 holiday data breach at Target and, more recently, a new large data breach at Home Depot, the need for organizations to understand online payments risk management is truly more important today than ever before.
I think there is no better way than for companies and payment card providers to step back and acknowledge many “security” measures are not effective today in combating cyber crime.

Ohad Samet’s book is a great introduction to payment risk management from multiple angles and can be a good base document to build upon in bringing PCI compliance efforts to online payment websites.

It may even be interesting to see how Samet positions Loss over Fraud. The implications can be rather surprising.

Samet has organized this book into logical sections regarding approaches and the use of analytics to optimize tracking losses while also addressing the role of the organization and the people implementing secure transactions. Regardless of its 2013 publication, section 3 on Tools and Methods provides solid, industry-tested solutions that should be reviewed annually.

That said, it's time to roll up your sleeves and begin protecting consumers.