Wednesday, September 4, 2013

The Future of Big Data is Cognitive Big Data Apps

Volume, Velocity, Variety and Veracity of your data, the 4V challenge, has become untamable.  Wait, yet another big data blog?  No, not really.  In this blog, I would like to propose a cognitive app approach that can transform your big-data problems into big opportunities at a fraction of the cost.

Everyone is talking about big data problems, but not many are helping us understand big data opportunities.  Let's define a big data opportunity in the context of customers, because growing the customer base, customer satisfaction and customer loyalty is everyone's business:

  • you have a large, diverse and growing customer base
  • your customers are more mobile and social than ever before
  • you have engaged with your customers wherever they are: web, mobile, social, local
  • you believe that "more data beats better algorithms" and that big data is all data
  • you wish to collect all data - call center records, web logs, social media, customer transactions and more so that
  • you can understand your customers better and how they speak of and rank you in their social networks
  • you can group (segment) your customers to understand their likes and dislikes
  • you can offer (recommend) them the right products at the right time and at the right price
  • you can preempt customer backlash and prevent them from leaving (churning) to competitors and taking their social networks with them (negative network effects)
  • all this effort will allow you to forecast sales accurately, run targeted marketing campaigns and cut cost to improve revenues and profitability
  • you wish to do all of this without hiring an army of data analysts, consultants and data scientists
  • and without buying a half-dozen or more tools, getting access to several public / social data sets and integrating it all into your architecture
  • and above all, you wish to do it fast and drive changes in real time
  • And most importantly, you wish to rinse and repeat this approach for the foreseeable future
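To make the opportunity concrete, here is a minimal churn-risk sketch in Python.  Every field name, threshold and weight below is an illustrative assumption, not any real product's logic; a real app would learn these from the combined data sources listed above.

```python
# Minimal churn-risk sketch: score customers by a few engagement signals
# and flag likely churners before they leave (and take their social
# network along).  All fields, weights and thresholds are assumptions.

def churn_score(customer):
    """Combine a few signals into a 0..1 churn-risk score."""
    score = 0.0
    if customer["support_calls_last_90d"] > 3:
        score += 0.4          # frustrated customers call support often
    if customer["days_since_last_purchase"] > 60:
        score += 0.3          # disengagement shows up as inactivity
    if customer["negative_social_mentions"] > 0:
        score += 0.3          # public backlash often precedes churn
    return min(score, 1.0)

customers = [
    {"id": 1, "support_calls_last_90d": 5, "days_since_last_purchase": 90,
     "negative_social_mentions": 2},
    {"id": 2, "support_calls_last_90d": 0, "days_since_last_purchase": 10,
     "negative_social_mentions": 0},
]

# Customers above the risk threshold get a retention offer.
at_risk = [c["id"] for c in customers if churn_score(c) >= 0.5]
print(at_risk)  # [1]
```

A cognitive app would replace the hand-set weights with models that retrain as new call-center, web-log and social data arrives.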
There are hardly any enterprise solutions in the market that address the challenges listed above.  You have no choice but to build a custom solution, hiring several consultants and striking separate license agreements with public and social data vendors to get a combined lens on public and private data.  This approach will be cost prohibitive for most enterprise customers and, as "90% of IT projects go," will be mired in delays, cost overruns and a truckload of heartache. 

The advances in technologies like in-memory databases and graph structures, as well as the democratization of data science concepts, can help address the challenges listed above in a meaningful and cost-effective way.  Intelligent big data apps are the need of the hour.  These apps need to be designed and built from scratch with these challenges and technologies such as cognitive computing[1] in mind.  They will leave 1990s technology paradigms like "data needs to be gathered and modeled (caged) before an app is built" in the dumpster and will achieve the flexibility required of all modern apps: adapting as the underlying data structures and data sources change.  These apps can be deployed right off the shelf with minimal customization and consulting because the app logic will not be anchored to the underlying data schema and will evolve with changing data and behavior.

Enterprise customers will soon be asking for a suite of such cognitive big data apps across all domain functions so that they can put big data opportunities to work and run their businesses better than their competitors.  Without a dynamic cognitive approach in apps, addressing the 4V challenge will be a nightmare and big data will fail to deliver on its promise.

Stay tuned for future blogs on this topic including discussions on a pioneering technology approach.

[1] Cognitive computing is the ability to analyze oceans of data in context with related information and expertise.  Cognitive systems learn from how they're used and adjust their rules and results dynamically.  Google's search engine and Knowledge Graph technology are predicated upon this approach.  

 This blog has benefited from the infinite wisdom and hard work of my former colleagues Ryan Leask and Harish Butani and that of my current colleagues Sethu M., Jens Doerpmund and Vijay Vijayasankar.

Image courtesy of  MemeGenerator

Sunday, August 25, 2013

Data Science: Definition and Opportunities

Image courtesy of BBC
My thoughts on data science: what it is, what skills data scientists have, the current issues in the Business Intelligence pipeline, how machine learning can automate part of the BI chain, why and how data science should be democratized and made available to everyone including decision makers (business users), how business analysts should build complex data models, how data scientists should be freed from the mundane rinse-and-repeat ETL tasks that precede building models that inform decision making, and how companies can build a business practice around data science. 

Key Premise: big data is all data, and big data apps offer the ability to combine all data (public + private) and expand the horizon to discover more meaningful insights.

Data Science is:
  • The art of mining large quantities of data 
  • The art of combining disparate data sources and blending public data with corporate data
  • Forming hypotheses to solve hard problems
  • Building models to solve current problems and provide forecasts
  • Anticipating future events (based on historical data) and providing corrective actions (finance, banking, travel, operational runtime)
  • Automating the process to reduce the time needed to solve future problems
A data scientist has the following minimum set of core skills:
  • Problem solver
  • Creative and able to form a hypothesis
  • Able to program with large quantities of data
  • Can identify appropriate data sources and bring and blend their data 
  • Stats/math/analytics background to build models and write algorithms 
  • Can quickly develop the domain knowledge needed to understand the key factors that influence a business problem
Roles Data Scientists play:
  • Problem description 
  • Hypothesis formation
  • Data assembly, ETL and data integration
  • Model development (pattern recognition or any other model that provides answers) and training
  • Data visualization 
  • A/B testing 
  • Proposing solutions and/or new business ideas
The balance between human vs. machines:
  • Current: humans play a significant role in the process – ETL, joins, models, visualization, machine learning – repeating and recycling this process as the problem changes
  • Tomorrow: a big portion of the food chain can be automated via machine learning, so machines can take over and data scientists are freed up to build more algorithms/models 
  • The process can be automated so repeating/recycling becomes cheaper and less time consuming
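The "repeat and recycle" idea above can be sketched as a reusable pipeline: once each step is captured as a function, re-running the whole chain on new data is one call rather than a manual effort.  This is a hypothetical Python sketch of the pattern, not any specific product's API:

```python
# A tiny, generic pipeline: each stage is a plain function, so the whole
# ETL -> model chain can be re-run automatically on fresh data.
from functools import reduce

def make_pipeline(*stages):
    """Compose stages left-to-right into a single callable."""
    return lambda data: reduce(lambda acc, stage: stage(acc), stages, data)

# Illustrative stages (assumptions, not a real product's steps)
def extract(raw):      return [r.strip() for r in raw if r.strip()]
def transform(rows):   return [float(r) for r in rows]
def model(values):     return sum(values) / len(values)  # toy "forecast"

pipeline = make_pipeline(extract, transform, model)
print(pipeline([" 10 ", "20", "", "30 "]))  # 20.0
```

The design point is that the human builds the stages once; the machine handles every subsequent repeat.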
The Data Science pipeline currently looks like:
  • From data to insights – the entire process requires mundane skills (IT), specialized skills (data scientists) and elements of human psychology to present the right information at the right time 
  • The data needs to be discovered, assembled, semantically enriched and anchored to business logic – this task can be automated through machine learning (a set of harmonized tools with AI) to free up scarce resources
  • Specialized skills today are addressed by open source technologies such as R and expensive solutions like Matlab and SPSS
  • Very few software solutions carefully design the human interface to make their application consumable without requiring customer training
This pipeline needs complete rethinking:
  • Automate mundane tasks that IT gets tagged with 
  • Discover data automatically 
  • Detach business logic from data models
  • Make blending public data with corporate data a second nature
  • Free up scientists so that they can build analytics micro-apps for a domain or a sub-domain
  • Data Science need not be a niche (specialized) category; it should appeal to the masses (democratization of data and bringing insights to everyone without needing specialized skills)
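"Make blending public data with corporate data a second nature" is, mechanically, a join on a shared key.  Here is a minimal sketch in plain Python; the data sets and field names are made up for illustration (a real pipeline would use a tool like pandas or SQL):

```python
# Blending corporate records with a public data set via a shared key.
# All records and field names below are illustrative assumptions.

corporate = [
    {"customer_id": "c1", "revenue": 1200},
    {"customer_id": "c2", "revenue": 800},
]

# Hypothetical public enrichment data keyed by the same customer_id
public = {
    "c1": {"region": "West", "sentiment": "positive"},
    "c2": {"region": "East", "sentiment": "negative"},
}

# Left join: keep every corporate row, enrich it where public data exists
blended = [{**row, **public.get(row["customer_id"], {})} for row in corporate]
print(blended[0])  # revenue and region/sentiment in one record
```

Once blending is this routine, the "expanded horizon" of public + private data becomes the default input to every model rather than a special project.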
Opportunities in Data Science: 
  • Understand the value chain (IT + Business Analyst + Data Scientists + Business Users)
  • Provide something for everyone – a single integrated platform (ETL + data integration + predictive modeling + in-memory computing + storage) for data scientists so that they can build standard analytical apps, move away from proprietary models and standardize (helps IT)
  • Analytical apps on this platform (think of them as Rapid Deployment Solutions) for business users
  • Help business analysts write basic models (churn, segmentation, correlation etc.) without needing advanced skills
  • Work with consulting companies so that they can consult and build apps for companies that do not have data scientists on their payroll (e.g., Mu Sigma and Opera Solutions)
  • Partner with public data providers (to help clients), consulting companies (Rapid Deployment Solutions) and R/Python/ML communities (mind-share and thought leadership)
  • Donate your predictive models to open-source communities
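As an example of the kind of basic model a business analyst should be able to run without advanced skills, here is a Pearson correlation in plain Python.  The spend/sales numbers are invented for illustration; in practice this would be one function call in R, SciPy or a platform like the one described above.

```python
# Basic correlation: do two business metrics move together?
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Illustrative question: does marketing spend track sales?
spend = [10, 20, 30, 40]
sales = [12, 24, 33, 46]
print(round(pearson(spend, sales), 3))  # 0.998 -> strongly correlated
```

Churn and segmentation models are a step up in complexity, but the goal is the same: the analyst asks the business question and the platform hides the math.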

Tuesday, August 13, 2013

Sum of Parts Valuation Analysis for Apple from Needham

Sum of parts is greater than the whole – when you get an indication that a company as a whole is not priced fairly, break the company down into parts and apply a sum-of-parts valuation.

The sum-of-parts valuation approach Needham used below to value Apple is my favorite.  It breaks the organization down into parts and applies a separate valuation model to each, using different growth assumptions, challenges and steady-state values to arrive at an appropriate long-term value (LTV).  In this approach, the fastest-growing businesses can fetch aggressive growth rates while slower or steady businesses get more moderate assumptions.
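The mechanics are simple: value each business line with its own multiple, then add the parts.  A sketch in Python, with segment revenues and multiples that are purely illustrative (not Needham's actual figures):

```python
# Sum-of-parts valuation: each segment gets its own revenue multiple
# reflecting its growth profile; company value is the sum of the parts.
# Segment names, revenues ($M) and multiples are illustrative only.

segments = [
    {"name": "fast-growing unit", "revenue": 50.0,  "multiple": 8.0},
    {"name": "steady unit",       "revenue": 100.0, "multiple": 3.0},
    {"name": "declining unit",    "revenue": 30.0,  "multiple": 1.5},
]

total = sum(s["revenue"] * s["multiple"] for s in segments)
print(total)  # 745.0 -> $745M for the whole company
```

Note how the fast grower contributes more value than the steady unit despite half the revenue; that asymmetry is exactly what a single company-wide multiple hides.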

Friday, April 19, 2013

Democratization of Business Analytics Dashboards

I am super impressed with the following visual dashboard from IPL T20 tournament - IPL 2013 in Numbers.  For those of you not so familiar with cricket or IPL, IPL is the biggest, the most extravagant and the most lucrative cricket tournament in the world.  I like the way IPL is bringing sports analytics to the common masses.

What is impressive is that each metric (runs, wickets, or tweets) is live so these numbers get updated automatically, pretty cool for IPL and cricket fans.  Also, each metric is clickable so one can drill down to his or her heart's content.  This is a common roll-up analysis but the visualization and the real time updates make this dashboard pretty appealing.  IPL team, thanks for not putting any dials on this dashboard (LOL).

I have been influencing, and am now building, analytics products that power these sports and various other dashboards/reports, and have been for many years.  The most fascinating thing is that these dashboards (or let's call them analytics in general) are reaching the masses like never before.  Everyone has heard terms like democratization of data and humanization of analytics.  This is it!  The data revolution is underway.  

Now, there are many new frontiers to go after and the existing ones need to be reinvented.  Yes, the analytics market is ready for massive disruption.  This is what keeps me excited about Business Analytics space.

Happy Analyzing and Happy Friday!

Friday, April 5, 2013

Tableau IPO: Let The Gold Rush Begin For Enterprise Software IPOs!

The year 2013 is going to be the year of enterprise software IPOs.  That is not a prediction but a well-discussed point in Silicon Valley.  Everybody believes there is pent-up demand from return-hungry investors for enterprise software IPOs.  Consumer software IPOs have failed to live up to their promise in the last couple of years, but enterprise software IPOs have continued to deliver; case in point: WDAY, NOW and SPLK.   

In the last couple of days, two of my favorite companies, Marketo and Tableau, have announced plans to go public.  Here are the links to Marketo's S1 and Tableau's S1.  I have had the good fortune to study, evaluate and follow both companies since 2010.  Both companies have done very well in their respective segments, SaaS marketing automation and on-premise self-serve BI.  They have both exceeded expectations on all fronts (employees, customers, analysts, markets, competitors) after a long hard slog.  

To all my friends, colleagues, investors and readers of this blog: enterprise software is a hard slog, and you are in it for the long haul.  Tableau is a 10-year-old company and Marketo is 7 years old (Source: SEC filings).

Since Tableau ("DATA") has announced its plan to IPO this year, I decided to put the stripped-down version of my due diligence, performed in early 2011, on my SlideShare account.  Back then, I used relative valuation with QlikView ("QLIK") as a close proxy to put a number on Tableau.  I used the PE (earnings multiple) and PS (revenue multiple) of QLIK and assessed a market value of $380 million based on Tableau's 2010 revenues of $40 million (from their press release in 2011; this number has been revised down to $34 million in the S1 – huh, strange!).

Now, if one were to use QLIK's current revenue multiple of 5.5 (Source: Yahoo Finance), Tableau could be valued between $700 million (based on trailing revenue of $128 million) and $1.4 billion (based on $256 million in expected revenue for 2013, assuming they grow their revenue YET AGAIN by 100% in 2013).

I personally don't think the street should use QLIK as a proxy; instead, it should apply Splunk's ("SPLK") lens to value Tableau.  Using SPLK's multiple of ~19.7 (Source: Yahoo Finance), Tableau would be valued at $2.5 billion based on their 2012 revenues.  ServiceNow ("NOW") also has a PS multiple of ~19. 
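The arithmetic behind these figures is just revenue times a price-to-sales multiple.  The sketch below reproduces the ranges quoted above (revenue in $ millions, multiples as sourced from Yahoo Finance at the time):

```python
# Relative valuation via a price-to-sales (PS) multiple.
def ps_valuation(revenue_m, ps_multiple):
    """Company value ($M) = revenue ($M) x price-to-sales multiple."""
    return revenue_m * ps_multiple

# QLIK's multiple applied to trailing and projected revenue
print(ps_valuation(128, 5.5))   # 704.0  -> roughly $700 million
print(ps_valuation(256, 5.5))   # 1408.0 -> roughly $1.4 billion

# SPLK's multiple applied to 2012 revenue
print(ps_valuation(128, 19.7))  # ~2521.6 -> roughly $2.5 billion
```

The spread between the two multiples is the whole argument: picking SPLK over QLIK as the comparable more than triples the implied valuation on identical revenue.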

I have strong reasons to believe that the street will value Tableau in this range, based on a great growth story to this point and amazing opportunities ahead as we are just starting to drill the BigData mountain.  I will not be surprised to see the valuation range from $2.5 billion to $5 billion.  Amazing!

Tableau's S1
I studied Tableau's S1 filing briefly, looking for information on the valuation and the number of shares offered.  Not much is disclosed there just yet; it will likely appear in subsequent filings as they hit the roadshow to assess demand from institutional investors.  Just like Workday, Tableau will also have dual-class shares (Class A and Class B) with different voting rights.  The Class A shares will be offered to investors by converting Class B shares. 

The last internal valuation of employee options priced the stock at ~$15.  To raise $150 million, Tableau will be putting at least 10 million Class A shares on the block.  Of course, this will change as demand builds up following their roadshow.  One thing is certain: the stock will definitely be priced above $15.  How many points above $15, we will find out in the next few months.  

Let the mad rush begin!!!