Big data and analytics – the media is on the bandwagon

There are 3 items in London’s @CityAM paper this morning on big data and analytics.

City A.M. readers are business people, mainly in London and, if they are like me, on the Tube. City A.M. only writes what it thinks this audience is interested in reading. And it seems business people like reading about big data and analytics a lot.

On page 22 is a case study, “Using data to drive new sales”, on a firm, “The Outside View”, that uses data and analytics to find prospects. The main thread is about using a very wide range of data, not just internal data: LinkedIn, for example.

The second is an opinion piece, “Why it’s nimble SMEs that are best-positioned to capitalise on the huge benefits of big data”. It is mainly about the lower cost of managing big data, thanks to the cloud and so on.

Online there is a third, “The big data toolkit”, by @jacquitaylorfb.

If you are promoting data and analytics in your organisation, these might make good PR for you.

Why big data matters for accountants – a good read

In today’s London CityAM paper, page 25, there is a good article: “Why big data matters for accountants”.

Some main points for me:

  • When accountants say something is a gold rush you have to believe it.
  • “The ability to link data sets is creating new insights” is in the second paragraph. So even accountants agree that data design is a first order issue.

However, there is no example use case given. Some come to mind, but I would be very interested in your ideas. What are the big data use cases for accountants? Please leave a comment.

Why big data needs models more than most

The dirty secret of big data is exposed in this very good posting: “Forget the Algorithms and Start Cleaning Your Data”.

Big data projects fail not because of the technology but because of the difficulty of wiring the data together. The lack of success is driven by:

  1. Poor data quality and/or inadequate data error handling
  2. Incompatible or poorly understood semantics from different data sources
  3. Complex matching rules between data sources

The blog suggests that big data tooling therefore needs to focus on the burden of integrating, cleaning, and transforming the data prior to analysis. For example, RapidMiner has 1,250 algorithms for this purpose. That might be good, but it is also very complex for the average human.

Sounds like a classic case of the need for separation of concerns, right? Untangling and designing solutions to these first order problems is data modelling. Given big data’s fluid data structures, that means datapoint modelling. Solutioning with 1,250 data manipulation algorithms, Hadoop and huge databases can then be based on visible logic and good design. With the alternative, jumping right into the build, best of luck!
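To make the separation of concerns a bit more concrete, here is a minimal Python sketch of my own. It is not taken from the posting or from any tool, and every name in it is hypothetical: the two sources ("crm" and "ledger"), their field names, the shared `Customer` model and the matching rule are all assumptions made up for illustration. The point is only that each of the three problems listed above gets its own small, visible piece of logic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Customer:
    """Hypothetical shared model that every source is mapped onto."""
    name: str
    postcode: str
    revenue_gbp: float


def clean_text(value: Optional[str]) -> str:
    """Problem 1, data quality: collapse whitespace, reject empty values."""
    if value is None or not value.strip():
        raise ValueError("missing required text value")
    return " ".join(value.split())


def from_crm(row: dict) -> Customer:
    """Problem 2, semantics: the assumed CRM stores revenue in pounds."""
    return Customer(
        name=clean_text(row["company"]),
        postcode=clean_text(row["post_code"]).upper(),
        revenue_gbp=float(row["revenue"]),
    )


def from_ledger(row: dict) -> Customer:
    """Problem 2, semantics: the assumed ledger stores thousands of pounds."""
    return Customer(
        name=clean_text(row["account_name"]),
        postcode=clean_text(row["zip"]).upper(),
        revenue_gbp=float(row["turnover_k"]) * 1000.0,
    )


def is_same_customer(a: Customer, b: Customer) -> bool:
    """Problem 3, matching: one rule, stated in one place."""
    same_name = a.name.lower() == b.name.lower()
    same_postcode = a.postcode.replace(" ", "") == b.postcode.replace(" ", "")
    return same_name and same_postcode


def load(rows, mapper):
    """Error handling: keep bad rows aside instead of failing the whole load."""
    good, errors = [], []
    for row in rows:
        try:
            good.append(mapper(row))
        except (KeyError, TypeError, ValueError) as exc:
            errors.append((row, str(exc)))
    return good, errors


# Made-up sample rows, purely for illustration.
crm_rows = [
    {"company": "  Acme  Ltd ", "post_code": "ec1a 1bb", "revenue": "2500000"},
    {"company": "", "post_code": "n1 9gu", "revenue": "100"},
]
ledger_rows = [
    {"account_name": "ACME LTD", "zip": "EC1A1BB", "turnover_k": "2500"},
]

crm, crm_errors = load(crm_rows, from_crm)
ledger, ledger_errors = load(ledger_rows, from_ledger)
matches = [(c, l) for c in crm for l in ledger if is_same_customer(c, l)]
```

With the model and the rules written down like this, swapping in heavier tooling later is a build decision rather than a design decision.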

Capgemini’s testing offering is model driven

The Capgemini testing offering to the banking sector can be seen here

My interpretation is that they have packaged some existing model driven offerings into something their consultants can use on site.


  • Hard coded: The domains they cover include payments and credit cards. The model driven solutions they use look like branded offerings from other vendors. Their solution does not look like it can drive model driven testing into other domains.
  • Scalable: They like model driven approaches because new knowledge and know-how can be built into the models as they go. That knowledge then deploys to other clients for free.
  • Robust: By putting the capability into the tooling the senior management can be more confident that their consultants on the ground will deliver a good job.

Maybe I should start up a conversation with them. Our modelDT tool delivers these advantages in the general case.

The disadvantage of not being hard coded is the setup needed for your business domain. But then the scope is everything, not just what others have done already.

ModelDT: how to industrialize testing

I have just posted to SlideShare the slides I presented at the NoMagic UML conference.

“In this presentation you will learn steps towards making your testing: – Correct – Scalable – Agile – Low cost”