Data as Cash

What is your data worth?  Do you know the worth of your data?  I bet that if there were a privacy breach, you would find out at that moment.  Otherwise, what does the data actually do?  Information storage for information’s sake is just a copy; it is also hoarding.  Information assembled in an intelligent, high-affinity fashion is worth a lot these days.

I will say it is worth more than the software that makes it so.

By definition, what does that mean for the software that accesses that data?  Is it as important as the actions taken on the data?  Is it really about the application itself or the results of the information that the application creates across multiple data stores?

To take this a step further, I believe it is no longer about the data sets in situ but how multiple data sets can be fused together with context, geo-spatial and behavioral information.  It is no longer about the software but what the software does with the data – and lots of it.

Unless you live under a rock, you know that distributed MapReduce-style systems are now the norm for any massive-scale, data-intensive compute service.  However, there is a catch here.  Digital lifestyles are one massive data collection opportunity.  These digital breadcrumbs are the stuff of congressional hearings on privacy.  The problem is how we can place a recurring revenue stream on this data.  We inherently know it is the new cash.  Distributed data that describes itself is the DataSpace.  The DataSpace is the vault or vaults which create useful information.  This fluidity of data is a very important concept in the proper functioning of Data as a Service (DaaS).

Knowledge Fusion and Assumptions

Knowledge Fusion is a term that historically draws on many dataspace concepts and techniques developed in other areas such as artificial intelligence (AI), machine discovery, machine learning, search, knowledge representation, semantics, and statistics. It is starkly different from other decision-support technologies inasmuch as it is not purely retrospective in nature. For example, language-based ad hoc queries and reporting are used to analyze what has happened in the past, answering very specific business questions such as, ‘How many widgets did we sell last week?’ When using these tools, the user already has a question or hypothesis that requires answering or validation. Knowledge Fusion is very different: it is forward-thinking and aims to predict future events, discover unknown patterns, and subsequently build models. These models are then used to support predictive ‘what-if’ analysis, such as, ‘How many widgets are we likely to sell next week?’ based on the context and meaning of the data as it is utilized.  KF also allows this context and meaning to infer further linkages from the initial query.  How is this knowledge fusion processed and accessed?  Why is it needed?  Let’s start with some technical assumptions from the BOOM paper:
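To make the difference between the two widget questions concrete, here is a minimal sketch in Python. The sales figures and function names are made up for illustration, and the least-squares trend line is only a stand-in for whatever model a real Knowledge Fusion system would build.

```python
from statistics import mean

# Hypothetical weekly widget sales (units), oldest week first.
weekly_sales = [100, 110, 120, 130, 140]


def sold_last_week(sales):
    """Retrospective query: report what has already happened."""
    return sales[-1]


def forecast_next_week(sales):
    """Predictive 'what-if': fit y = a + b*x by least squares, then extrapolate one week."""
    xs = range(len(sales))
    x_bar, y_bar = mean(xs), mean(sales)
    slope = (
        sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, sales))
        / sum((x - x_bar) ** 2 for x in xs)
    )
    intercept = y_bar - slope * x_bar
    # The next week has index len(sales).
    return intercept + slope * len(sales)


print("Sold last week:", sold_last_week(weekly_sales))
print("Likely to sell next week:", forecast_next_week(weekly_sales))
```

The first function only reads the past. The second builds a (very crude) model and asks it about a week that has not happened yet, which is the forward-looking part.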

1. Distributed systems benefit substantially from a data-centric design style that focuses the programmer’s attention on carefully capturing all the critical state of the system as a family of collections (sets, relations, streams, etc.) Given such a model, the state of the system can be distributed naturally and flexibly across nodes via familiar mechanisms like partitioning and replication.

2. The key behaviors of such systems can be naturally implemented using declarative programming languages that manipulate these collections, abstracting the programmer from both the physical layout of the data and the fine-grained orchestration of data manipulation.

Based on these assumptions, I would like to add some observations from my experiences:

3. The more data we have, the easier the job of the machine learning and data mining algorithms.

4. Clean Data is paramount.  85% of the time spent on creating value out of data goes to cleansing, creating views, and modeling.
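Assumptions 1 and 2 can be sketched in a few lines. This is plain Python standing in for a real declarative language, and the relation, node count, and function names are all hypothetical. The state is a set of tuples, a stable hash of the key decides which node owns each tuple, and the aggregate is written without caring where the data lives.

```python
from collections import defaultdict
from zlib import crc32

NUM_NODES = 3  # hypothetical cluster size

# System state captured as a relation: a set of (product, quantity) tuples.
sales = {("widget", 5), ("gadget", 2), ("widget", 7), ("gizmo", 1)}


def node_for(key, num_nodes=NUM_NODES):
    """Hash partitioning: a stable hash of the key picks the owning node."""
    return crc32(key.encode("utf-8")) % num_nodes


def partition(relation, num_nodes=NUM_NODES):
    """Distribute the relation across nodes by key."""
    shards = defaultdict(set)
    for row in relation:
        shards[node_for(row[0], num_nodes)].add(row)
    return shards


def total_by_key(relation):
    """The 'query': total quantity per product, independent of physical layout."""
    totals = defaultdict(int)
    for key, qty in relation:
        totals[key] += qty
    return dict(totals)


shards = partition(sales)

# Every tuple for a given key lands on the same shard, so per-shard
# results can be merged without any cross-node coordination.
merged = {}
for shard in shards.values():
    merged.update(total_by_key(shard))

print(merged == total_by_key(sales))
```

The same query runs unchanged on one shard or on the whole relation, which is the point of abstracting the programmer from the physical layout.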

Another platform that I believe will see wide adoption is BigData from the folks at Systap, LLC.  Check it out here. These guys are really looking at the future of fully distributed linked data and massive Resource Description Framework (RDF) stores.  They are also taking into account concerns such as high availability, transactional processing, B+ trees, and sharding.  I hope to see this trend in the enterprise.

Revenue Models for DataSpaces

A caveat emptor at this juncture:  One of my biggest issues with semantic intelligence, knowledge fusion, knowledge discovery, machine learning and data mining is that people believe it is a MAKE IT SO button! Click here and all your dreams will come true. People believe it will give them business strategy answers or generate the next big thing.  Folks, y’all still have to think.

While this is all well and good, what do we base the importance of data upon?

  • Revenue Per Employee (RPE)?
  • Keywords Per Transaction (KPT)?
  • Bulk Rate DataLoad (BRD)?
  • Business Lift Based Query (BLQ)?
  • Affinity Per URI (APU)?
  • Insert your favorite monetization acronym here…

Securing The Query

An interesting trend is finally happening in the industry.  Folks are realizing that you must secure those mixed and mashed queries.  At the recent Semantic Technology 2010 Conference, there was a panel discussion on security in the semantic world that specifically dealt with dynamically applying access controls.  These technologies allow people to slice and dice the DataSpaces based on proper access control and the meaning of the data streams.  For a great white paper and overview, please visit this link.
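As a rough illustration of what "dynamically applying access controls" to a query could look like, here is a minimal Python sketch. The record layout, role names, and the secure_query function are all assumptions made up for this example; they are not taken from the panel or the white paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    subject: str              # who the data is about
    meaning: str              # hypothetical semantic label for the data
    value: str
    allowed_roles: frozenset  # roles permitted to see this record


# A toy DataSpace mixing data with different meanings and access rules.
DATASPACE = [
    Record("alice", "purchase", "widget", frozenset({"marketing", "finance"})),
    Record("alice", "location", "store-42", frozenset({"security"})),
    Record("bob", "purchase", "gadget", frozenset({"marketing"})),
]


def secure_query(records, role, meaning=None):
    """Return only the records this role may see, optionally narrowed by meaning.

    The access check runs at query time, so the same query yields
    different slices of the DataSpace for different callers.
    """
    return [
        r for r in records
        if role in r.allowed_roles and (meaning is None or r.meaning == meaning)
    ]


print([r.value for r in secure_query(DATASPACE, "marketing", "purchase")])
print(secure_query(DATASPACE, "marketing", "location"))
```

Marketing can mash up purchases across people but never sees a location record, even though both live in the same DataSpace.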

The ability to secure specific person-meaning queries in and of itself will usher in completely new monetization models for data.  I like to call this the Reese’s Peanut Butter Cup model.  You have chocolate.  You have peanut butter.  Put them together, and you have something really good.  Remember, data does not have calories; it just consumes bandwidth, disk space, and compute resources – depending on usage – your mileage may vary.

Until then,

Go Big Or Go Home!

The Origami of Instinct

We, as a society, are losing our instincts. I believe that instinct plays a role in successful startups. In a startup, it is imperative that you possess a passion for being the strongest, fastest, smartest, or a combination of all of the above with respect to the technology or business model you are creating. There are no gray areas when it comes to creating a successful agile business. You cannot pass off the blame or pass the buck in this environment. This is one reason that so many come back to the startup world. This is survival of the fittest. Yet some do not view it this way. They see it as a matter of tearing others down instead of raising everyone’s performance level.

Playing to Success; Playing to Failure

Some play to success; some play to failure. Playing to success is the process that I prefer and, honestly, the only one I know how to perform within the context of my career. Playing to success means that you involve as many people as possible in the discussions to make the pie larger for everyone. You also involve people that are BETTER than you.  I like to call this “Wolves Drinking Out of the Same Pond.” If all the wolves drink from the pond and protect the pond, it will create a more extensive and healthier pond for all the wolves (companies) involved. Playing to success is connecting the Elite of the Elite together. If you look at the world of successful startups, the people involved are usually multi-faceted over-achievers. They are great business types AND world-class rowers; they are world-class programmers AND accomplished musicians. Then most add extracurricular endeavors like marathon running, martial arts, or extreme sports. Most read self-help books or some non-fiction daily; if it is fiction, it usually is something arcane, or they are re-reading Milton’s Paradise Lost. These people usually seek out others of like kind to aid them. Also, most do not need self-help books, yet they strive to make themselves better at all costs.

So let’s return to the importance of losing our instincts. Our ability to perceive fight or flight situations is diminishing. We do not need to detect these situations daily. Or do we? We no longer need to smell the air to detect an adversary. What does that do to our pre-cognitive processes when engaging with an adversary (read: competitive company)? Is tearing down the new competitive startup advantageous?

On the contrary, startups should band together. The startups are the wolves. We have the advantage: we have the speed, might, willpower, and passion for creating faster and more focused entities. Exits are Acquisitions. So why wouldn’t we want to increase our ability to detect Fight or Flight? Do you think that Venture Capitalists can’t detect when you are not entirely committed or are unsure of the answer to the question you were just asked? You bet they can detect any fright mechanism. Use what is at your disposal, what led you to arrive at this juncture. Increase and practice your senses (all 6 of them), and I bet it will make you a better entrepreneur. If it doesn’t, at least you will have improved yourself. For those that already know, you already have, and I will see you at the pond.

The Can’t Do Voice

Most will not find this pertinent to making money or startups. However, it is very pertinent. Undo the “can’t do” thoughts in your head. Can’t do’s have no place in a startup.

This is an excerpt from a larger work entitled “A Day at the Zoo” by the now-deceased Dr. C.S. Hyatt

Sa-Shu-Ah – by Crag Jensen

Close your eyes, relax and float downstream.

Hello – I am the man on the Radio
I am the man on your television News Channel
I am the man behind the pulpit and the man who works for the Government
I am the man whom you must believe in because I am the man whom you fear the most.
I am – alas- the man who has been planted inside your head since you were too young to even talk – to walk- to reason or question.
Yes, it’s true – I am the man who never tires of telling you what to do – how to vote – what to eat – what to buy – what to believe in – where to go – when to go – and how to go.
I am the man who sits inside your head and dictates – dictates – dictates. I am, therefore – and in essence – your personal – government approved & church sanctioned – dictator.
I am not you, but I will be all that is left of you at the end of the day because you have long since relinquished the responsibility of being who you really are because it is much, much, much easier to listen to and obey me – rather than to break free of me and actually accomplish something worthwhile in your – so-called life.

Life-life-life
No Don’t waste it
Don’t be a slave to your programming – anymore
When you can open – yes you can open the door.
And Set yourself free.

And though I wish you well – only time and work will tell
Just how you will fare – true success is all too rare.

But whatever you do wherever you go – don’t forget
To keep on undoing yourself (undoing yourself, undoing yourself…)

The Death of Keywords and Domain Names

I was considering the death of keywords and domain names.  Here is what I believe is going to happen over the next year, if not sooner:

1) Keywords will be supplanted by relational phrases

2) Domain names will be supplanted by Twitter Usernames

So before you sit there saying, “God, this guy is a complete dumb ass,” just roll with it for a second or two.  Finally, with the infrastructure of the web becoming stable, many of the technologies that were around circa ’89 / ’99 have come back into the fray.  Here is a laundry list in no particular order of importance: Machine Learning, Data Mining, Natural Language Processing, Semantic Intelligence, Predictive Analytics.  Some contain parts of others, and some are considered umbrellas for other technologies.  The importance of the infrastructure being in place is phenomenal.  We used to have to worry about things like co-location, bandwidth, storage, etc., and now they are a given.  EC2, S3 and the like have made these worries a non-issue, and we can get down to the business of creating intelligent computing – finally.

Trending a (#) hashtag on Twitter on Follow Friday is nothing more than creating your own intelligent content delivery (iCDN) system, albeit a human-powered one at this time.  It is possible to build relational and semantic links across those hashtags and also to fuse the symbolic with the semantic.  Don’t think it’s possible?  http://www.mashmeup.com  Enter a complete URL and watch what happens.  No Keywords.  No AdWords.  So what does that leave to bid on in the world of search or ads?  What if you didn’t have your precious AdWords to buy or Keywords to put into your SEO?  We have to come up with a different model.  How about a model of barter arbitrage based on the immediate viewership in the moment?  Just In Time Content Creation that can be trended and optimized.  When we used to think about this, there was no such animal as device edge caching.  Done.  There was no such animal as Support Vector Machines in real time.  Done.  We are at a precipice where this is possible, and we are seeing it.  There is, however, the issue of privacy in creating a personalized, in-the-moment web world for you, which I will discuss in a different bloggatorium.

That said, we will start bidding on information in situ and not on the banal keyword.  I can hear it now:  Keywords were so 2010.

Given that we will no longer have keywords to bid on, what’s in a name?  Mashable.  Jack.  PerezHilton.  You get the picture.  I am sure most of you have already barnstormed the TwitterVerse, much like the heyday of domain names.  There are still some nuggets out there.  However, just look at the traffic.  It is amazing to think that Twitter was an accident.  I like to call Twitter the Narcissistic Attention Deficit Disorder Network (aka NADD).  There have already been some rumblings on the back channel about the going price for the top 10 slots.  Immediate hits on your page (back to my first point).  How does the Twitter Username semantically relate to what is happening in the moment?  How about changing your Twitter Username to fit the moment and then bidding?  A new Stock Market Model.  The opportunities.

Then again I could be totally wrong.