Matthew Cowen
  • Issue 18 : A step-by-step example of extracting value from unstructured data

    Using some basic statistical analysis, you can extract value fairly simply

Bienvenue/Welcome to all the new subscribers. Thank you so much for signing up. Don’t forget you can read all the archives here. Let me know if you have any comments.

    On to this Issue.

After the last few issues giving you practical advice on your data’s worth, I thought I’d take a different direction and give you a step-by-step tutorial in basic data manipulation, with the aim of extracting value from it. It’s a fairly simplistic example, but one that shows what you can do with a little patience and the techniques I’m showing here. Bust out your calculators, it’s going to get deep 😉 Enjoy.


    Gaining insights from basic, unstructured data

For both personal and business reasons, I’ve been journaling for a number of years now but have recently slowed down. I first got into it after reading numerous articles from people I respected in the business world, and out of a general interest in well-built applications that achieve their goal brilliantly. The application that got me started and really piqued my interest was an app called DayOne. It’s a lovingly designed application that really helps you get your thoughts written down. Since this newsletter is not a review of the application, I won’t go into how it works and its features, but this review will give you a great overview.

I’d noticed that recently I’d stopped journaling, or at least hadn’t been as regular and rigorous as usual, and I wanted to know why. In the spirit of extracting value from raw unstructured data (call it Big Data if you will), I set out to analyse my journaling. As with any analysis, I had a goal in mind. First there were several questions to answer, then an analysis to see if I could “nudge” (see Side Bar - Nudge Theory) myself into better journaling, or at least better regularity.

A quick note: whilst I’m fully aware that this is not specifically “business data”, it serves as an illustration of how some simple data can reveal more information than you might at first have thought.

Some of the questions I wanted answered were designed to help me get back on the wagon, so to speak. Let’s have a look in detail at each question.

    Essentially there are two big questions that required answering:

• When and why did I stop journaling?

• What could I adjust to incentivise me into more regular journaling?

In this first part, I’m concentrating on the first question, as it is the basis on which to answer the second. To make things legible, I created a MindMap of the full question and sub-question list related to that first question.

    20190529-DayOne Stats Questions.png

    The mind-mapped query tree

Phew, that’s a lot of questions for something that on the face of it sounds very simple. In this Issue I will delve into questions 1 through 4, and dedicate another issue to questions 5 and 6, which in analytical terms require more effort, and I’d like this newsletter not to serve as a soporific! Before we dig into the details, I developed a six-step plan to get me towards answering the question tree and possibly resolving the base issue: how could I tweak things to incite me to journal better and more often?

    DraggedImage.png

     My Six-step plan

    Getting Data

Getting the data is generally a simple process and you should be able to find useful data without too much trouble. My case was no different. I had a journaling app with entries, 385 to be precise, so all I had to do was export them. Luckily for me, DayOne features an export function and offers several export formats. As I was about to manipulate the data, I chose .txt (Plain Text). It would have been easier in .csv (Comma Separated Values) format, but that didn’t stop the process. The data was in fact rather oddly structured, partly to be human-readable I guess, and it required some work to get it into a usable state. More on that later.

    Journal.png

From a business perspective, many applications will offer an export function into various formats, but if the application you’re working with doesn’t, a quick exchange with support will likely get you the data you’re looking for.

    Choosing Analysis tools

Now that I had some raw data, albeit in an unusable format for the time being, I set about seeing which data analysis tools would be best suited. Without getting into a big philosophical discussion, there were two obvious candidates: Microsoft Excel and Microsoft PowerBI. I’ve used both previously and appreciate both systems for different reasons, but in this case I thought that the easy-to-use data manipulation tools built into PowerBI would suit my needs better. In some cases, further statistical analysis might require the use of R to dig deeper into the data.

Alternatives to PowerBI exist; here are a few:

    Tableau (a free to use edition called Public is available too)

    Qlik

    Zoho Analytics

    DataDog

    Data Munging

The next step was to do what is called “Data Munging”. The Wikipedia definition is:

    ...the process of transforming and mapping data from one "raw" data form into another format with the intent of making it more appropriate and valuable for a variety of downstream purposes such as analytics.

    This is also referred to as Data Wrangling and the two terms are interchangeable. Whilst I was not exactly transforming the raw data into another format, I was certainly cleaning the data up and rationalising what to keep. Below is a sample export from DayOne from a journal that collects daily quotes.

    DraggedImage_1.png

    The raw data exported from DayOne

Looking at the data, we can see several problems that need to be fixed in order to exploit it. For example, there are no specific delimiters (Tab, Space, Comma, etc.). The date and time are mixed on the same line, with “ at “ in the middle and the time zone appended to the end. Two lines down we see the contents of the journal entry, in this case on one line, though a larger journal entry will span multiple lines. The munging involved cleaning this up.

Using text find/replace tools, it was a simple matter of getting rid of “ Date: “ by replacing it with nothing, doing much the same to remove the “ at “, and finally removing the “GMT”. I did, however, take the opportunity to insert a comma and a space after the date, to create a quasi-.csv file to import into Microsoft Excel for the next phase of the munging.
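If you’d rather script this step than do it in an editor, a few lines of Python achieve the same clean-up. This is a minimal sketch: the exact header format is an assumption based on the export shown above, and the file names are placeholders.

    import re

    # Turn the DayOne plain-text export into a quasi-CSV of "date, time" rows,
    # mirroring the find/replace steps described above.
    rows = []
    with open("Journal.txt", encoding="utf-8") as f:
        for line in f:
            # Header lines look like: "Date: 4 June 2019 at 06:12:03 GMT"
            m = re.match(r"\s*Date:\s*(.+?) at (\d{1,2}:\d{2}:\d{2}) GMT", line)
            if m:
                rows.append(f"{m.group(1)}, {m.group(2)}")

    with open("journal.csv", "w", encoding="utf-8") as f:
        f.write("Date, Time\n" + "\n".join(rows) + "\n")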

    DraggedImage_2.png

    The start of the clean-up process

    Importing and Rationalising Data

Importing this into Excel gives us a formatted file with the date and time separated into columns, with the text data and extra lines occupying the areas in between those dates and times. For the purposes of this exercise I chose to remove all the text data and leave the two columns. The end result, formatted as a table and sorted by date, as presented in Excel:

    DraggedImage_3.png

    The cleaned-up and structured data, ready for use in PowerBI

Now, with the data cleaned up, it was time to start extracting value from it in the chosen analysis tool, Microsoft PowerBI in this case. PowerBI is a simple and very powerful tool that helps you import and work with not just one data source but several simultaneously. Once the data is imported into PowerBI you can create relationships using simple drag-and-drop tools. For example, if you were trying to understand the relationship between ice cream sales and the weather, the table with sales data (including the dates and times) could be ‘joined’ by a relationship to a weather table — data imported from your local weather bureau, for example. Mapping sales against weather becomes a trivial matter from thereon. Let’s have a quick look at my simple one-table data and see if I can answer some of the above questions.
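For readers who prefer code to drag-and-drop, the equivalent join in Python/pandas is a one-line merge on the shared date column. The tables and numbers here are invented purely for illustration.

    import pandas as pd

    # Invented example: daily ice cream sales joined to weather readings by date
    sales = pd.DataFrame({
        "date": pd.to_datetime(["2019-06-01", "2019-06-02", "2019-06-03"]),
        "units_sold": [120, 95, 180],
    })
    weather = pd.DataFrame({
        "date": pd.to_datetime(["2019-06-01", "2019-06-02", "2019-06-03"]),
        "max_temp_c": [27, 22, 31],
    })

    merged = sales.merge(weather, on="date")  # the 'relationship' between the tables
    print(merged["units_sold"].corr(merged["max_temp_c"]))  # do sales track temperature?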

    Applying Visualisations and Analysis

The first question I wanted to answer was about the distribution of entries and whether it was consistent. The following visualisation is pretty self-explanatory. Clearly, I started in earnest in 2017 and continued in 2018, with a big fall-off in 2019. Yes, there is consistency for two years, but a recent drop-off, even when adjusting for the fact that 2019 is not yet half done. This one histogram answers questions 1 and 3 from the above MindMap (Is there consistency over the years? and Is there a noticeable pattern?).

    Entries by year.png

    Count of journal entries by year

    In answering question 2 (did I journal more in some months as opposed to others?), the following histograms were created:

    Distribution by Month.png

    Count of journal entries by month (all years)

    Distribution by DOW.png

    Count of journal entries by day (all years)
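If you want to reproduce these counts outside PowerBI, the same aggregations take a few lines of pandas. A minimal sketch, assuming the quasi-CSV produced during the munging step, with its ‘Date’ and ‘Time’ columns:

    import pandas as pd

    # Load the cleaned-up two-column file produced during the munging step
    df = pd.read_csv("journal.csv", skipinitialspace=True)
    df["when"] = pd.to_datetime(df["Date"] + " " + df["Time"], dayfirst=True)

    print(df["when"].dt.year.value_counts().sort_index())  # entries per year (Q1/Q3)
    print(df["when"].dt.month_name().value_counts())       # entries per month (Q2)
    print(df["when"].dt.day_name().value_counts())         # entries per weekday (Q4)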

Again, pretty self-evident: I’ve been more consistent in some months than in others (October, September and December). There are other inferences I can draw too. For example, I’m not subject to a rush of New Year’s resolutions, deciding to do something in the New Year and then quickly dropping out (gym membership offers play on this). Tuesday seems to be the day I journal most often, with Wednesday and Thursday close behind in frequency, while Monday, Friday, Saturday and Sunday are the days I miss more often. I peak in the middle of the week, if you will, thus half-answering question 4 (Was I more likely to journal on certain days of the week?).

    The other half of the answer came from the following histogram:

    Distribution by Day and Time.png

    Split of journal entries by day of the week and the hours at which the entries were made

It’s a little difficult to read, as the data is somewhat spread out, but essentially we can conclude that I write most entries in the morning, between the hours of 5am and 8am. To better understand it, look at the key in the top left of the image, which shows the days split into blocks of one hour. Yes, I have written in the journal at around 4am! So, I’m a morning person. I already felt this was the case, but to have it all but confirmed by data is interesting. Depending on the type of task at hand, this kind of data can help you schedule when to attempt those tasks. For me, the morning is better for the more reflective tasks, thinking and writing.
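The same split can be reproduced as a pivot table, continuing the pandas sketch above:

    # Day-of-week × hour-of-day matrix of entry counts
    pivot = (
        df.assign(day=df["when"].dt.day_name(), hour=df["when"].dt.hour)
          .pivot_table(index="day", columns="hour", aggfunc="size", fill_value=0)
    )
    print(pivot)  # rows: days, columns: hours, values: number of entries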

To cut the data another way, into a monthly analysis, the following histogram was created:

    Distribution by Month and Time.png

    Split of journal entries by month and hours of the day of the entries

By month, the one stand-out thing that needs investigating is why I have more entries in the afternoons during the months in which I made more entries overall. This admittedly simple and silly example does in fact show how simple data can help us make better analyses by revealing things that wouldn’t necessarily be seen using standard tools like tables in Excel. But… beware of causation and correlation (see Side Bar - Causation versus Correlation).


    Conclusions

The big takeaway I wanted to impart is that it is possible to reveal interesting and useful facts from even the simplest of data sets using modern tools. Digital Transformation calls for us to gather more data, and this is one of the reasons why. This long issue only dips into the possibilities; questions 5 and 6 on the list would be the next logical steps to take. To answer them, I would need to extend the table with columns for the number of entries and the word count of each entry. Measuring, for example, negative or positive word counts could be used to gauge the overall mood of the entries, with the obvious caveats. But you can see how a simple multi-column sheet can provide rich insights. As I mentioned before, joining other datasets to this model could provide even better analysis (weather, GPS, etc.)… possibilities!
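As a taste of that next step, counting words per entry and tallying a small list of positive and negative words takes only a few lines. A minimal sketch; the word lists are invented, and a real analysis would use a proper sentiment lexicon.

    # Hypothetical mood gauge: word count plus a crude positive/negative score
    POSITIVE = {"happy", "great", "good", "progress"}
    NEGATIVE = {"tired", "stressed", "bad", "stuck"}

    def mood_score(entry: str) -> int:
        words = entry.lower().split()
        return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

    for entry in ["Great progress on the project today", "Tired and a bit stressed"]:
        print(len(entry.split()), mood_score(entry))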


    Side Bar - Nudge Theory

Nudge Theory is, according to this Wikipedia article:

    Nudge is a concept in behavioral science, political theory and behavioral economics which proposes positive reinforcement and indirect suggestions as ways to influence the behavior and decision making of groups or individuals. Nudging contrasts with other ways to achieve compliance, such as education, legislation or enforcement.

    Nudge Theory has its roots in cybernetics and clinical psychotherapy before being more formally and scientifically described sometime after 1995.


    Side Bar - Causation versus Correlation

When looking at data, it is imperative that you understand the difference between causation and correlation. It cannot be assumed that one thing “caused” another just because the two are correlated, even when a histogram suggests it. Further analysis should be done to try to disprove the cause; when you can’t, you have a little more reason to believe in it, but never 100%. Conflating the two is a classic error in data analysis and one to be avoided at all costs.

To better understand the difference, have a read of this article, which picks apart, as one example, a very serious article in the Economist suggesting that ice cream consumption is related to IQ, i.e., the more frequently it is eaten, the higher the IQ… I’ll let you discover it for yourselves.


    Reading List

    Metadata is the biggest little problem plaguing the music industry

    A great article from The Verge, on the other side of data usage and its potential for harm to copyright holders.


    The Future is Digital Newsletter is intended for a single recipient, but I encourage you to forward it to people you feel may be interested in the subject matter. Thanks for being a supporter, have a great day.

    → 10:00 AM, Jun 7
  • Issue 17 : The Security and Privacy Wars in the Digital Age

    It’s all about software

    After a couple of practical issues, I thought I’d get back to analysis and strategy — kind of the initial reason for this newsletter 😄 Hope you enjoy it.

    Onwards with this issue.


    Security and Privacy Wars

A long time ago, in Internet terms, we had a security war between Unix and Windows. Windows, the shiny new up-and-coming OS, was open by default, and that openness enabled all sorts of advances; but like anything, it could be used for good and bad, and that openness allowed a lot of bad things to happen, some of which we are still dealing with decades later. Unix, on the other hand, was secure by default, because everything was turned off by default. You were required to open things up as you needed them and were discouraged from opening anything unless it was absolutely necessary and could be justified. The evident reality today is that the Unix model won out, and Windows is currently vastly more secure as a result of adopting a more cautious approach. I’m simplifying somewhat, but you get the picture.

Given that history often repeats itself, this battle seems to be taking place again in the mobile era, the two protagonists being Apple and Google of course. Google’s approach is akin to Microsoft’s in its burgeoning years, with Apple’s being more like the Unix model. That’s no coincidence: macOS is in fact built upon a Unix foundation, the Berkeley Software Distribution (BSD). Many tech nerds will cry foul at my assertion, rightly pointing out that a kernel isn’t the whole thing. That’s fine; it’s to illustrate the point.

Apple’s implementation of its developer APIs (Application Programming Interfaces) shows us that it is much more concerned with securing and protecting as much as possible from the get-go, something Google doesn’t seem to be either interested in or capable of. Actually, that’s unfair: Google is perfectly capable of it. Perhaps not with the current implementation of Android, but the engineers at Google could easily do so if required. That leaves us with the only sane explanation: Google has no interest in securing Android the way Apple secures iOS. And that’s fine, it’s their choice. However, I thought it would be an interesting exercise to compare and contrast some areas of Android and iOS to give context to the impending privacy wars that are just gearing up, as you’ll see later.

    Security: 2-factor authentication

During a WWDC State of the Union address — the keynote for developers, aimed at developers, rather than the morning’s keynote that is mostly for users and the press — Apple spoke about a number of enhancements for the future of security and privacy. The statistic that around two-thirds of all Apple ID users have enabled and use two-factor authentication, versus around 10% on other platforms, was somewhat surprising. Why don’t users on other platforms set it up? It exists, after all, and has done for a while.

Well, it’s a relatively easy question to answer, in fact. It’s simply too difficult for most people to understand and set up. Apple makes it a little easier, but even here many users still don’t use it. Google’s OS is much more widely used in absolute numbers, and it is used in places where security is either seen as less necessary or less well understood. With many times more users, Google has a hard time getting people on board; and because of its open-doors policy in the beginning, getting the updates required to help secure its users is additionally extremely difficult.

    Security: OS updates

Google’s OS is fractured to a staggering degree. Just over 10% of Android users are using the latest version, Pie. And even if you wanted to update, the sheer number of Android phones in circulation that are below the minimum threshold for updating to more recent, and hence more secure, versions is a testament to the disregard that Google, the manufacturers and the telecommunications companies have for you as a customer.

Apple’s adoption statistics are unsurprising because Apple does a much better job of updating users’ OSes and supporting legacy devices compared to Google and its OEM hardware partners. In fact, Apple claims that 83% of devices sold in the last four years are running the latest version of iOS, iOS 12. That’s a staggering achievement, and it doesn’t end there: 80% of ALL devices are running the same version. The remaining 20% are presumably a mix of those not yet updated and those that cannot be; in absolute numbers, that’s still around 180 million phones around the world.

Android smartphones make up around 85% market share compared to Apple’s 15%, so if we take Apple’s figure of around 900 million iPhones in circulation, that implies roughly 5 billion Android phones worldwide, of all types. If only 10% of those are on the latest release (around 500 million), that leaves billions of phones with potentially dangerous security flaws.
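The back-of-the-envelope arithmetic, for anyone who wants to check it (the market-share and installed-base figures are the rough approximations quoted above):

    # Rough installed-base arithmetic using the figures quoted above
    iphones = 900e6                  # Apple's ~900 million iPhones in circulation
    apple_share, android_share = 0.15, 0.85

    total = iphones / apple_share    # ~6.0 billion smartphones in total
    android = total * android_share  # ~5.1 billion Android phones
    on_latest = android * 0.10       # ~510 million on the latest release
    print(f"{android/1e9:.1f}B Android phones, "
          f"{(android - on_latest)/1e9:.1f}B not on Pie")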

    Privacy: Data protection (on device encryption)

Android has supported full-disk encryption since Android Gingerbread (version 2.3), released in 2010; however, the implementation leaves much to be desired.

Firstly, it’s opt-in, apart from on some of the latest, most high-end devices on the market running Lollipop (5.x) onwards. Once again, average users are unlikely to take advantage of this essential security feature. Apple’s iPhone OS 3 introduced full-disk encryption in 2009, yes, the era of the iPhone 3GS! That it was applied automatically, with literally no user interaction, meant users were more secure by default.

    Security: Password Management

iOS includes a rudimentary built-in password management tool called Keychain that stores passwords, recalls them when needed, and syncs them so that passwords saved to it surface on all of your Apple devices: save a password once and it’s available everywhere. More recent versions suggest strong passwords when presented with password creation or change dialogs.

Apple hasn’t stopped there; it’s adding many features to macOS too, its less walled-garden OS. Of note is a new notarisation service for apps that are distributed outside of the App Store. It allows Apple to lock down malicious code before it’s even distributed. In basic terms, an app that uses the service connects to Apple’s servers on startup to check it still has a valid pass. If that pass has expired — which could be for a number of reasons — the app is prevented from starting, thus, in theory, protecting your computer and all it holds.

As we get further into the digital age, it’s privacy that matters

In May of this year, both Facebook and Google held their developer conferences, Facebook’s F8 and Google’s I/O. Both companies made announcements that hinted at a new direction for some of their products, stating that they’re turning to privacy by default.

    Google’s announcement as told in Wired:

    Google placed a big emphasis on user privacy in this keynote. It’ll now be easier for users to access their Google security settings from their smartphone and from there quickly delete their web history and app activity. The firm will also process more user data on the device without uploading it to its own servers.

    Google Maps is also getting its own minor privacy overhaul, with a new incognito mode that won’t remember search results or dropped pins. This brings Maps in line with Chrome – which has had incognito mode for a decade – and YouTube.

    On the security front, Google is making it easier for Android phone owners to verify their identity through two factor authentication. For certain Android smartphones, the phone itself will act as a security key, allowing users to verify their identity with a single press, doing away with the need to receive and input a code.

    Facebook did the same, as recounted by BuzzFeed:

Facebook CEO Mark Zuckerberg kicked off his keynote with a privacy-focused speech. "Privacy gives us the freedom to be ourselves ... so it's no surprise that the fastest way we're communicating online is small groups," Zuckerberg said. "That's why I believe the future is private." Following the comments, the CEO said, jokingly, "I get that a lot of people aren't sure that we're serious about this. I know that we don't have the strongest reputation on privacy right now."

Let’s be clear on one thing: they’re still collecting data in every way they can and matching that data up to potential advertisers (including themselves) in order to make money. Their fundamental business model has not changed; they’ve just found new ways of executing on it. What they have done, however, is try to change the conversation towards privacy, in a thinly veiled attempt to divert attention from, or subvert, impending lawsuits against them from the EU and the US DOJ.

Listening to episode 244 of the Critical Path podcast by Horace Dediu, I had a similar realisation to the one Horace put forward: personal data should be treated like a controlled substance.

In Digital Transformation you’re going to be handling data all the time, and as a result you’re going to need to treat it like that controlled substance Horace describes. Chemicals and arms can be legitimately owned and used, but they are controlled (in most civilised countries anyway) to the point that the harm done with them is limited. The GDPR, in this its first guise, tries to develop a framework that controls personal data in much the same way: you need to explicitly ask for permission before collecting it, you need to clearly state the purpose for which you’re asking, and lastly you need to clearly and explicitly detail how you are going to use it.

Hardware platforms and their apparent stances on openness versus closedness, security and privacy are starting to matter less and less; it is in the software layer that the opportunity and the risk lie. Software is eating the world. The coming European Copyright Directive, recently approved but not yet in law, will expose this to an even greater degree.


     Reading List

    05onfire1_xp-articleLarge-v2.jpg

    GDPR After One Year: Costs and Unintended Consequences - Truth on the Market

    Here’s a different angle on the usefulness of GDPR, worth the read.

    GDPR can be thought of as a privacy “bill of rights.” Many of these new rights have come with unintended consequences. If your account gets hacked, the hacker can use the right of access to get all of your data. The right to be forgotten is in conflict with the public’s right to know a bad actor’s history (and many of them are using the right to memory hole their misdeeds). The right to data portability creates another attack vector for hackers to exploit. And the right to opt-out of data collection creates a free-rider problem where users who opt-in subsidize the privacy of those who opt-out.

    cq5dam.web.1440.660.jpeg

    Source: Deloitte

    Bringing digital to the boardroom

    DIGITAL transformation is not just about adopting new technologies. Its significance, especially in the business world, extends to how technology can be used to create—and sustain—a competitive advantage.

    As such, digital transformation, along with the potential for disruption, is high on the agenda for executives at many financial institutions, as well as their boards of directors.

    Not just financial institutions I’d say, and I’d go even further and suggest that most Caribbean businesses would do well to understand this and heed the advice given in this article.


    The Future is Digital Newsletter is intended for a single recipient, but I encourage you to forward it to people you feel may be interested in the subject matter.

    Thanks for being a supporter, I wish you an excellent day.

    → 10:00 AM, May 31
  • Issue 16 : Part 5 - Practical steps towards your Digital Transformation journey

    Data collection and turning that data into an asset

Good day, all. This issue is a continuation of Issue 15; I encourage you to read that one first, as it’ll help your understanding of this one, which continues the theme of data and its importance in Digital Transformation.

I recently recorded another podcast with Kadia Francis, the Digital Jamaican. You should check out her blog and podcast series, really good stuff. I’ll let you know when it’s out, but we talked about some of the topics from the last few issues of this newsletter. I had a great time recording and it’s something I’d like to do more of in the future. I’m open to proposals. Stay tuned.

    On to the issue.


    Source : dailyworth.com

Turning data into a benefit

In the last issue, I noted that data is the new oil of the digital economy, and showed a few simple strategies for finding and analysing data, where data can emanate from, and finally a couple of simple tools for linking data sets between business applications to break down the silos of data in your applications. What I didn’t get into is the finer detail of those data and what is important when you are collecting and analysing them.

Big Data: you’ve all heard the term, and there are many definitions of what exactly Big Data is. I don’t actually know if there is a “definitive” definition or not, but it certainly doesn’t just mean large quantities; this is a myth, and one that persists. Sure, Big Data can be huge in size, and there are data sets used in everyday activities, like weather pattern prediction, that are absolutely gigantic, but its more pertinent definition hinges on the fact that it is unstructured.

    Remember the definition of unstructured data from the last issue:

    Unstructured data represents any data that does not have a recognizable structure. It is unorganized and raw and can be non-textual or textual. Unstructured data also may be identified as loosely structured data, wherein the data sources include a structure, but not all data in a data set follow the same structure.

Additionally, Big Data usage and cost have become perfectly aligned with cloud infrastructure, particularly in the database space. Previously, companies would have had to invest massively in database and data processing infrastructure, mostly stored locally with all the overheads that requires: cooling, electricity, backup infrastructure, fire safety and large personnel costs. These were often huge investments, available only to the (rich) few. In today’s paradigm, cloud database infrastructure is not only cheap but offers any organisation the possibility of using all but the most powerful compute systems in existence — the most powerful being reserved mostly for education, research and military purposes. And remember, cloud infrastructure has an elasticity that on-premise infrastructure doesn’t: need 10 processors now but 100 at peak volume? No problem in the cloud world.

Not only that, but analytics add-ons for those cloud databases are available directly from the database supplier, and increasingly from other suppliers whose USPs go above and beyond those of the database vendor. We’ll take a look at some of these in future issues. The thing to remember nowadays is that big data doesn’t necessarily have to come with big costs. In some cases you don’t even have to pay a penny to store it: I highlighted this in the last issue without explicitly mentioning it in this context, but places like data.gov (US), Facebook, Twitter and many others do all the back-office work for you.

Other data that is unstructured and being generated at a dizzying rate is the data that records or tags things with a longitude and latitude: location data. All modern smartphones record location data, and many cameras do similarly when taking photos. Clearly the use cases for location data are endless, and it is sometimes used in surprising ways, and not necessarily nefarious ones. If we recorded our own location on an ongoing basis, it would probably provide insights into our behaviours that could help us with issues as diverse as health — imagine discovering that you have traversed a particularly polluted zone daily for years and correlating that with your asthma incidents — to things as mundane as optimising journeys for the best fuel usage.

I briefly mentioned in the last issue that the types of sensors in use since the early 2000s are providing “…a near constant avalanche of data…”, in what has now become known as the Internet of Things, or IoT. IoT is now being deployed right throughout the entire value chain in highly digital businesses, and this influx of data is providing insights that were previously inconceivable. I remember, at Microsoft’s large Inspire conference a couple of years ago, a discussion and demonstration of the power of data collection for ThyssenKrupp:

    ThyssenKrupp Elevator wanted to gain a competitive edge by focusing on what matters most to their customers: reliability. Drawing on the potential of the Internet of Things (IoT) and Microsoft technologies by connecting their elevators to the cloud, gathering data from their sensors and systems, and transforming that data into valuable business intelligence, ThyssenKrupp is vastly improving operations—and offering something their competitors do not: predictive and even preemptive maintenance.

Read that statement again: …something their competitors do not (currently) have. Predictive and preemptive maintenance. In other words, data allows ThyssenKrupp to schedule preventive maintenance and even predict lift failures before they happen. If you can get past the obvious advert for Microsoft, this quick video gives you a great idea of what I’m describing about data as an asset. Again, the possibilities for data collection and analysis are endless. Microsoft has gone even further, coining the terms Intelligent Cloud and Intelligent Edge, defined as:

    The intelligent cloud is ubiquitous computing, enabled by the public cloud and artificial intelligence (AI) technology, for every type of intelligent application and system you can envision.

    The intelligent edge is a continually expanding set of connected systems and devices that gather and analyze data—close to your users, the data, or both. Users get real-time insights and experiences, delivered by highly responsive and contextually aware apps.

    Screenshot 2019-05-22 at 20.44.41.png

    Source: microsoft.com

But all this data collection and storage isn’t, and couldn’t be, anything useful if the data isn’t shaped and presented to provide insights, or what has become known over the last 20 years as Business Intelligence (BI). I alluded to this earlier in the issue, but advanced BI tools and a few simple Data Science skills have become a hot topic for most businesses on their Digital Transformation journey. Take a look, for example, at how many Data Scientist roles there are and how well paid they currently are. There is a real scarcity of good data analysts in business roles.

Rather fortuitously for me, this last Wednesday the NY Times published a long but fascinating article called How Data (and Some Breathtaking Soccer) Brought Liverpool to the Cusp of Glory, about the use and subsequent valorisation of data to help Liverpool FC out of the blues of the last 10 years or so, transforming them into one of the top teams to beat in Europe, with some even saying that this might be the start of a new era for Liverpool FC, like its previous run of form from 1975 to 1990.

    For four years, from 2008 to 2012, Graham advised Tottenham. The club was run by a series of managers who had little interest in his suggestions, which would have been true of nearly all the soccer managers at that time. Then Fenway bought Liverpool and began implementing its culture. That included hiring Graham to build a version of its baseball team’s research department. The reaction, almost uniformly, was scorn. “ ‘Laptop guys,’ ‘Don’t know the game’ — you’d hear that until just a few months ago,” says Barry Hunter, who runs Liverpool’s scouting department. “The ‘Moneyball’ thing was thrown at us a lot.”

    Graham hardly noticed. He was immersed in his search for inefficiencies — finding players, some hidden in plain sight, who were undervalued. One afternoon last winter, he pulled up some charts on his laptop and projected them on a screen. The charts contained statistics such as total goals, goals scored per minute and chances created, along with expected goals. I was surprised to see Graham working with such statistics, which he had described to me as simplistic. But he was making a point. “Sometimes you don’t have to look much further than that,” he said.

    And that’s the point. Some simple statistics may help you better understand a taxing issue and hence aid in your resolution of the problem. Thoroughly recommended.


    The risks of data collection

I think it would be remiss of me if I didn’t at least talk briefly about the dangers of data collection. I’m sure you don’t need me to tell you that holding a lot of data, and more specifically personally identifiable data, is a risk that you need to assess in your business. In fact, this is the real reason for the General Data Protection Regulation’s (GDPR) being.

If we look at the intention of the GDPR, it’s that data should be treated like a controlled substance, a controlled substance being something like weapons, drugs and so forth. Clearly the general public shouldn’t be purchasing, storing, using and selling on these types of substances and products. The GDPR makes us look at personally identifiable data in the same way: we shouldn’t need to collect, store, process and/or sell on these data. Unless…

The similarity continues in that controlled substances, arms and drugs can actually be bought and sold legally, but under strict controls (in most civilised countries at least). GDPR, like the controlled substance trades, ensures that we justify why we need the data, state in explicit terms the purpose for which it will be used, and promise that it won’t be used for other purposes. Additionally, GDPR prevents the resale of that data, again, unless there is specific consent from the affected users.

When you develop your data collection strategies, you need to think quite carefully about the data you need and whether it is classified as personally identifiable data, and plan accordingly: from consent forms to operational structures that ensure it isn’t used outside the initial scope, that it is secured appropriately, and that you can provide proof of its ultimate destruction. Think in terms of the entire lifecycle of the data: creation, collection, processing, transformation, encryption, depersonalisation, restitution, backup and destruction.

Remember, according to GDPR you are responsible for proving all this. I’d like to go further into this subject in the future, especially since we’ve had numerous examples of data care failures from the likes of Facebook and Cambridge Analytica, and I’m currently researching the subject. Soon.


    Reading List

    Source: StockSnap (Pixabay)/ict-pulse.com

    ICTP 056: Building Caribbean-relevant software applications, with the team from Rovika

    A talk with Dennison Daley and Manish Valechha from Rovika, a software house based in Montserrat. They have developed apps for Montserrat and BVI governments and have potential for wider-ranging projects throughout the Caribbean.

    Source: E-Estonia

    20th William G. Demas Memorial Lecture to focus on digitisation

If you weren’t aware, Estonia has been on a national Digital Transformation journey for a number of years now. Estonia even offers full digital citizenship, or e-Residency, and was the first country in the world to do so. Starting in 1997, Estonia implemented its e-Governance strategy and has continually innovated and developed new digital services for its residents, both physical and e-Citizens. It has fully embraced technologies like blockchain — in 2008 in fact! — and has ambitious plans for future services like intelligent transportation.

    In this, the 20th William G. Demas Memorial Lecture, Calum Cameron of Estonia is to speak about “Transforming to a Digital Society: Principles and Challenges”. Unable to be there in person, I’m hoping I can get a feed or at least a video of the speech. If you have any information to help me, let me know please. Thank you.

    Source: epthinktank.eu

    Europe, a unique partner for Caribbean

    I alluded to some of this in the podcast with Kadia, but this essay gives a good overview of how Europe can help the Caribbean and why it’s important for the Caribbean to work together facing increasing competition from around the world. Definitely worth reading even if it’s a little on the propaganda side from a High Representative of the Union for Foreign and Security Policy.


    The Future is Digital Newsletter is intended for a single recipient, but I encourage you to forward it to people you feel may be interested in the subject matter.

    Thanks for being a supporter, have a great day.

    → 10:00 AM, May 24
  • Issue 15 : Part 4 - Practical steps towards your Digital Transformation journey

    Data, data, data

    Good day everyone. As promised, back to a practical issue, this time less generalist and more centred on purely digital initiatives. Hope you enjoy it, let me know, don’t be shy.

    On to this issue.


    Data is the new oil

    If nothing else, Digital Transformation is inseparable from data, not just any data, but good data and data that can be exploited in the short term as well as the long term. 

    Data is the new oil in the digital economy

Although there is some dispute about the reality of the phrase, with some reasoning that data is not the new oil, it holds true for many businesses looking towards digital transformation. Businesses produce data all the time, but it is mostly lost: stored but not accessed, or downright under-exploited. We are data-rich but analysis-poor, and it’s to our detriment. At the other extreme, data is not God, so some restraint and sensible treatment is necessary. But I’m getting ahead of myself, so let us look at data and how we can, firstly, identify data to capture and then develop data capture strategies.

    Where is the data coming from?

Before digital systems were the norm, data capture was organised and designed in a systematic manner, with businesses thinking specifically about what it was they wanted to retain. The tools necessary to measure data were onerous investments and notoriously unreliable.

The first Building Management Systems (BMS), developed by the likes of Sauter and Johnson Controls amongst others, were simple systems based on Programmable Logic Controllers, or PLCs. A far-off thermostat would send (often infrequently) a signal to the central unit, which would apply some simple logic and then actuate the command on another basic module to apply the remedial action. A high temperature in the main meeting room due to human activity would trip the thermostat to signal the central unit, which would in turn consult the PLC code, start the fans and send cool air to the room (I’ve highly simplified it here, but you get the picture):

IF RoomTemp > HighSetpoint THEN
    ColdWaterValveOpen := TRUE; CoolingFansOn := TRUE; END_IF;

In more operational roles, data would be extracted from systems such as payroll, and in some cases specific market surveys were commissioned to generate data, but nothing like the amount we produce in the modern world. In fact, data is being produced not only in vastly larger quantities, but automatically, whether we want it or not.

With the applications, devices and sensors that have been deployed since the early 2000s, there is a near-constant avalanche of data being produced, on anything from the length of time you slept to the detailed nuances of the movement of people in a particular corridor of a shopping mall. Social media additionally generates tons and tons of information about us and our surroundings. The whole supply chain, from development to the final death of a product, is sourcing data all the time.

The indications are that this rate of increase is not slowing but accelerating. What was benign data may be turned into extremely valuable data in a short period of time. The Facebooks and Googles of the world know this and are making it easier to capture and analyse data internally, while making many of their tools available to the general public.

    The difference between then and now

The data projects of the past were designed and implemented to strict rules and guidelines, and the data produced was subsequently structured and used to fulfil basic objectives. To give you an idea of exactly what structured data is, let’s first look at its definition according to techopedia.com:

    Data conform to fixed fields. That means those utilizing data can anticipate having fixed-length pieces of information and consistent models in order to process that information.

We can see that structured data has specific attributes that let us, or our computers, know beforehand how that data will appear, allowing us to apply processing easily. Examples of structured data are elements such as the information in your passport: name, age, place of birth, expiry date and other basic, simple, fixed data types.

    Modern data, on the other hand, is unstructured and is defined by techopedia.com as:

    Unstructured data represents any data that does not have a recognizable structure. It is unorganized and raw and can be non-textual or textual. Unstructured data also may be identified as loosely structured data, wherein the data sources include a structure, but not all data in a data set follow the same structure.

Examples of unstructured data include the data generated by corporate Customer Relationship Management (CRM) systems or the previously mentioned social media applications like LinkedIn, Twitter and Facebook. Although the data generated is classified (name, amount spent, next call-back date, etc.), it is loosely tagged rather than forced into specific lines and columns in a spreadsheet. In fact, many modern systems deliberately leave data unstructured to push the boundaries of what can be extrapolated from these datasets, sometimes producing surprising results.
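To make the contrast concrete, here is a small sketch in Python. The passport record mirrors the structured example above; the CRM-style records are invented to show how loosely tagged data can vary within a single dataset.

    from dataclasses import dataclass

    # Structured: fixed fields with known types, like the passport example above
    @dataclass
    class PassportRecord:
        name: str
        age: int
        place_of_birth: str
        expiry_date: str

    # Loosely structured: records in the same dataset carry different tags
    crm_records = [
        {"name": "A. Client", "amount_spent": 1200, "next_callback": "2019-06-11"},
        {"name": "B. Prospect", "source": "LinkedIn"},  # different keys entirely
    ]

    print(PassportRecord("Jane Doe", 42, "Bridgetown", "2027-01-31"))
    print(crm_records)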


    Data capture strategies

    Let’s first look at some of the places you can find data relevant to your business. There are many sources and clearly, I can’t list them all and quite frankly I don’t know of all of them either. But there are some basic examples to help you get started on your own data research.

    Google Trends

Google Trends is probably one of the most well-known. It’s a free service open to all comers, and you don’t even need a Google account to benefit from its information. Simple to use, the interface is about as Google as you can get: type a word or two and Google Trends will tell you how those terms have been trending in Internet search over the last several years. Google Trends ranks its data on a scale from 0 to 100, with 100 being peak interest in the subject and 0 meaning either no interest or no data available. In the following example, I searched Digital Transformation and set the time span to “2004 - present”. This was the result:

    Screenshot 2019-05-15 at 14.30.52.png

    Source : Google Trends

Digital Transformation is at peak interest currently, which is logical, but it didn’t start becoming of interest until around 2014. That in and of itself is an interesting data point: Digital Transformation is a fairly new phenomenon. Additionally, you can pit search term against search term to gain insights into their comparative interest, useful for judging interest in one type of product versus another. Since this is not a lesson in how to use Google Trends, I won’t go into too much detail, but you should know that the data can be displayed by region, and related queries are suggested automatically, again giving you an idea of the things people are searching for.
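If you’d rather pull the same data programmatically, the unofficial pytrends Python library wraps Google Trends. A minimal sketch, assuming pytrends is installed (pip install pytrends); Google offers no official Trends API, so treat this as illustrative:

    from pytrends.request import TrendReq

    pytrends = TrendReq(hl="en-US")
    # The same query as above: interest from 2004 to the present
    pytrends.build_payload(["digital transformation"], timeframe="all")
    trends = pytrends.interest_over_time()
    print(trends.tail())  # values on the 0-100 scale described above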

    LinkedIn

Being the professional network, with over 500 million users, LinkedIn holds an enormous amount of data on companies and employees. Whilst there is no direct, simple-to-use interface aside from LinkedIn itself, tools are available to extract data directly from LinkedIn for manipulation in other systems. One such example is the LinkedIn Connection Analytics Dashboard from Vishal Pawar, Chief Architect at Aptude. It uses a PowerBI template to analyse the data exported from LinkedIn. There’s plenty more information at the link.

    DraggedImage.png

    Source : https://www.linkedin.com/pulse/brand-new-perform-free-linkedin-connection-analytics-vishal-pawar/

Don’t forget good old data exports from legacy applications. These data can be useful when integrated into an analytics application like PowerBI or Google Analytics; even some basic statistical analysis in a tool like Microsoft Excel can provide useful information.


    Cloud Applications or Software as a Service (SaaS)

The value of cloud applications goes beyond the initial promises of yesterday. Most SaaS applications, like Office 365 and Google Apps, were sold on the promise that they entailed no upfront costs and offloaded day-to-day operations management, freeing the client to concentrate on the real value-added aspects of the business. Whilst this is somewhat true, it is incredibly short-sighted. The real value of cloud-based apps is the data they generate, which can be collected, integrated, joined and exploited by all sorts of systems, providing value that is greater than the sum of its parts.

Take, for example, a small business with fairly standard needs for operations software: accounting, CRM and project planning. In the past the business would purchase a dedicated accounting application like Sage, a dedicated CRM such as Salesforce, and probably Microsoft Project. These applications created silos that largely prevented the use of information between the systems. In fact, a very well remunerated job in the past was that of the data integrator, who had the skills to “join” systems in a very basic manner to try to extract value from multiple systems rather than individual ones.

Today, my advice for a small business would be to go completely SaaS, but to choose systems that offer data integration APIs or connections. Modern SaaS applications often allow linkage to platforms such as Office 365 and CRM software like Hubspot, through mailing list management software (did anyone say MailChimp?), right through to accounting and billing. A client created in the CRM as a prospect should automatically appear in the project management system and be created as a client in the accounting software, regardless of whether it has purchased anything or not. The value generated by understanding, across the different systems, the when, why, how and how much has enormous potential. A sketch of this kind of linkage follows below.
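Here is a minimal sketch of that linkage in Python using Flask. Everything specific in it is hypothetical: the CRM webhook payload, the accounting system’s URL and the API key are invented placeholders, not any real vendor’s API.

    import requests
    from flask import Flask, request

    app = Flask(__name__)

    ACCOUNTING_API = "https://accounting.example.com/api/clients"  # placeholder URL
    API_KEY = "replace-me"                                         # placeholder credential

    @app.route("/crm-webhook", methods=["POST"])
    def new_prospect():
        # Called by the CRM whenever a prospect is created
        prospect = request.get_json()
        # Propagate the prospect to the accounting system as a client
        requests.post(
            ACCOUNTING_API,
            json={"name": prospect["name"], "email": prospect["email"],
                  "status": "prospect"},
            headers={"Authorization": f"Bearer {API_KEY}"},
            timeout=10,
        )
        return {"ok": True}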

Linking these data sources usually happens through one of two paths: either directly through the application’s interface, where data connections are exposed directly (from one application you log in to the other and grant access to the data), or via a second path that is a little more long-winded but often achieves the same thing and can provide even more capabilities. There are three well-known systems that fit this second description. Let’s have a quick look at each one.

    Microsoft Flow

For a business with a significant investment in Microsoft, and particularly Office 365 and its components, the obvious choice is Microsoft’s own workflow management tool, Flow. It allows you to turn multi-step repetitive processes into true workflows rich in data. Not limited to Microsoft applications only, Flow allows links to other SaaS platforms like MailChimp, Facebook, Google and Slack. Microsoft Flow is free to use, with options for Premium connections and workflows.

    IFTTT

If This Then That is considered the granddaddy of the initial wave of workflow applications and is still one of the most popular. Its real value is in the consumer SaaS space, where it allows, like Flow, the linking together of hundreds of different platforms: from controlling lights to sending emails when there’s a tweet with a particular keyword. IFTTT is free to use.

    Zapier

    Zapier is probably the most sophisticated and consequently the most complicated to use of the three, but don’t let that discourage you. It’s the most accomplished and reliable workflow application I’ve used. I personally use it to link data between my CRM, Time tracking system, accounting software and various social networks.

As you can see, data is all around us and reasonably easily exploitable with a little help from tools like those I’ve highlighted here. If you want some help with your own needs, get in touch; I’d be happy to help.

    In the next part I’ll look at more data collection and introduce notions of simple data usage with powerful analytical tools. I look forward to publishing the next issue.


    Recommended Reading

    MeasureWhatMatters.jpg

Measure What Matters, whilst not strictly about data collection and analysis, is a good book to get you thinking about the right types of data collection strategy with achievable outcomes (Key Results). It has a foreword from Larry Page, one of the cofounders of Google, who starts off by saying:

    I wish I had this book nineteen years ago, when we founded Google.


    Reading List

    Will Artificial Intelligence Enhance or Hack Humanity? - Wired

    Screenshot 2019-05-15 at 16.22.46.png

A really interesting interview with Yuval Noah Harari and Fei-Fei Li, by Nicholas Thompson of Wired.


    The Future is Digital Newsletter is intended for a single recipient, but I encourage you to forward it to people you feel may be interested. The more the merrier.

    Thanks for being a supporter, have a great day.

    → 10:00 AM, May 17
  • Issue 14 : Follow-up, Management in the Digital Age

    And Digicel's missed opportunity for Digital Transformation

    Good day everyone. Today’s Issue is a follow-up from Issue 12, Management in the Digital Age. Let’s get straight to it.

    Don’t forget to listen to the podcast I was a guest on ;) 

Kadia Francis (@digitaljamaican) asked me a question on Twitter that I thought was very interesting:

    @TFIDNewsletter Absolutely loved this. In terms of obsoletion isn't the same true for companies reluctant to incorporate technology into their business operations? We are in complete agreement that the future is digital, I say it all the time so stubbornness in that regard would be fatal.


    Operations and their role in the digital future 

The question was in response to Issue 12, where I talked about management obesity, in the sense that not only has the definition of management changed, but the number of managers necessary to operate a business these days has been constantly on the rise. This paradigm shift has produced a fundamental change in the structure of businesses:

    This change in manufacturing layer has brought about a change in company structure. Low-skilled low-pay jobs are being replaced by highly skilled jobs paying higher salaries. But it’s not just salaries, responsibilities have also increased. As a line manager for a spindle maker, the responsibility was limited to the line and the workers occupying it. In the new plants, managers are responsible for much more and decisions have more implications for both productivity and efficiency, with knock-on effects that may cause the loss of hundreds of thousands of dollars. And, being that there are no low-skilled workers anymore, everyone has essentially become a manager. This is what I termed Management Obesity, a factory with only managers and no workers.

    We’ve gone from a pyramid structure of management and workers to one with a portlier girth.

    IMG_C01202E2383D-1.jpeg

    The old management structure versus today’s

It’s not limited to manufacturing either. In much of the service industry, offices have become more and more staffed by managers of things and processes, and not necessarily of people. Many of today’s roles in modern offices didn’t even exist five years ago: Marketing, Social Media Manager, Community Manager, Technical Officers, Environmental Departments; I could go on.

That issue was largely targeted at the production side of a business but, as Kadia rightly points out, operations are also ripe for digital transformation. Let’s look at the various parts that make up an organisation, and I’ll attempt to show how digital transformation is or will be affecting them.

    The best approximation of the various elements of a business can be found in the value chain. I introduced and talked about it in Issue 11:

    Investopedia defines the Value Chain as:

    A value chain is a business model that describes the full range of activities needed to create a product or service. For companies that produce goods, a value chain comprises the steps that involve bringing a product from conception to distribution, and everything in between—such as procuring raw materials, manufacturing functions, and marketing activities.

    A company conducts a value-chain analysis by evaluating the detailed procedures involved in each step of its business. The purpose of value-chain analyses is to increase production efficiency so that a company may deliver maximum value for the least possible cost.

Essentially, Michael Porter’s Value Chain model allows you to break down your business into sections and attribute to each its relative value and contribution to the end result: the value-add of your products and services. As an example, someone who sells fruits and vegetables may have a fairly simple value chain in which elements like logistics and marketing are the principal value creators of the business. A product like Apple’s iPhone, I think we can all see, is a very different proposition, with an extremely complex value chain.

    The value chain is represented as a graphic where all the distinct parts contribute to the right-hand side, which is the profit. This graphic shows the most common representation of the value chain:

    16f540ae-19c6-41a7-a19e-44f706f7006c_1000x528.png

    Denis Fadeev CC BY-SA 3.0 (https://creativecommons.org/licenses/by-sa/3.0)

There are two distinct sections, Support Activities and Primary Activities, both in play in any business, generating value that helps generate revenue, and hence margin.

Looking at the support activities like firm infrastructure, which includes HR, Accounting, Finance, Legal, PR, etc., it is easy to see that in areas such as HR, digital transformation has not only started but is gaining real traction. An example is the use of the LinkedIn platform. Integrating it into your business, not only as a corporate rolodex but as an essential recruitment tool, provides benefits over and above those of traditional recruiters. LinkedIn offers targeted advertising, peer recommendations, full (up-to-date) CVs — as long as you’re willing to accept that they are not formatted in the old-school way — and, most importantly, the networks in which potential staff move. All this is done in minutes, compared to the hours or days (if it was possible at all) of the past.

    With Accounting (and, by the way, I’m very fearful for the long-term future of basic accountants), today’s software ranges from free for basic needs to a percentage cut of invoices for advanced services, allowing even the most reticent and least capable to post to the correct account and consolidate accounts in a matter of minutes. And it is here we find the opportunity for accountants in the future, but I’ll leave that hanging for a while…

    Most cloud accounting software lets you upload receipts and, with AI image processing applied, can automatically post them to the correct account: expenses, parking, fuel, purchases and so forth. Gone are the manual entry and the (almost certainly) badly organised piles of paper.
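    To illustrate the kind of decision being automated, here is a minimal sketch in Python. This is not any vendor’s actual API; the accounts, keywords and helper name are invented, and real products use trained image and text models rather than a keyword table.

    ```python
    # A toy sketch of automated posting: map text extracted (e.g. by OCR)
    # from a receipt to an expense account. Accounts and keywords are invented.
    ACCOUNT_RULES = {
        "parking": "Travel - Parking",
        "fuel": "Travel - Fuel",
        "petrol": "Travel - Fuel",
        "restaurant": "Meals & Entertainment",
    }

    def suggest_account(receipt_text: str) -> str:
        """Return the first account whose keyword appears in the receipt text."""
        text = receipt_text.lower()
        for keyword, account in ACCOUNT_RULES.items():
            if keyword in text:
                return account
        return "Suspense"  # nothing matched: park it for a human to review

    print(suggest_account("AIRPORT PARKING  EUR 12.00"))  # -> Travel - Parking
    ```

    The value to the business is not the lookup itself but the disappearance of the manual step; the accountant’s time shifts to reviewing the few receipts that land in suspense.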

    For Legal, it will be the same. Basic legal questions will be answered by machines capable of determining the right advice for simple situations. This frees up the business, which would ordinarily have to wait for Legal to get back to it. Again, the opportunity for lawyers to add value is enormous.

    In the primary activities of a business (procurement, manufacturing, logistics, marketing and sales, and service), we see much transformation already taking place.

    Procurement is undergoing fundamental changes to its processes, with the automation of ordering, for example.

    I talked about manufacturing and operations in Issue 12: Management in the Digital Age; take a minute to read it if you haven’t already. Marketing and Sales have been revolutionised by digital technologies and continue to advance almost faster than any other part of an organisation. In fact, when I talk about digital transformation with most businesses and people, they think in terms of marketing strategies using Facebook and Instagram, but this is only one side of a multi-sided problem.

    So, you see, the same shift and reduction of the obese middle is taking place not only on the production side of a business but at the very core of its operations throughout the enterprise. It’s a point I wanted to get across but didn’t articulate very well at the time.


    Digicel, cost-cutting and a missed opportunity for Digital Transformation

    Digicel.png
    Fastforwardai.png

    The OECS Business Focus site reports that Digicel has announced a partnership with fastforward.ai to “… accelerate its vision to become the digital lifestyle partner for its customers …”.

    Right off the bat, I’ll note my scepticism about how this will play out for Digicel. To me, it looks more like a cost-cutting exercise dressed up as a new and revolutionary offer for their clients. To illustrate why I think this, and what I think will play out, let’s look at the current position of Digicel and the offer that is fastforward.ai.

    Digicel has been struggling financially for a number of years. Revenues are not growing, and costs are increasing. Looking at the since-withdrawn F-1 filed in June 2015, prior to its attempted listing on the NYSE, we can see a company having difficulties with debt (estimated at around USD 6 billion), risk and cashflow.

    In 2015, 60% of its revenue came from voice communications; that has since fallen off a cliff, as far as anyone can tell, and it is unclear whether data services will replace the lost revenue stream. In fact, Digicel’s revenue in 2013, 2014 and 2015 was stable at around USD 2.7 billion, but it made a net loss in 2013 and 2015, the only bright spot being 2014, when it made a net profit, some of which was attributed to finance income. I’m not a financial analyst, so I don’t want to go into too much detail, but suffice it to say, things are not easy for Digicel.

    Digicel has a number of strengths that are non-negligible, and it is on these that it should build its innovations going forward. It currently boasts 14 million subscribers in the regions where it operates (Latin America and the South Pacific). It has built strong direct-to-customer relationships, and its products supply a wealth of data about usage and habits.

    It has, like most businesses, a few weaknesses too. The prepaid/post-paid ratio is lower than in European or American markets (to be fair, a symptom of the zones in which it operates rather than a failing on its part); it faces rising costs (5G, anyone?) and tighter margins due to a nose-dive in lucrative voice services. And, as it admitted in the F-1, it faces “significant competition” in its markets.

    Fastforward.ai turns out not to be what I’d hoped. When I first saw the announcement, I thought that Digicel was expanding into digital payments, something that is badly lacking in the Caribbean and, as a result, a massive opportunity for an established player with experience in transaction businesses and a direct-to-customer relationship. Much online commerce relies on digital payments, and development is only just starting in the Caribbean. Fastforward.ai is, instead, a platform for marketing on social media platforms like Facebook and Twitter, used to automate customer support and promote offers directly to users. Hence why I feel this is more cost-cutting than innovation.

    Perhaps I’ve misunderstood the opportunity for Digicel in this partnership, but I can’t quite see how this initiative will replace the falling revenues from its voice-services cash cow. Noted for review in the future.


    Reading List

    Digital Transformation Strategy to Launch Shortly in St. Kitts & Nevis - OECS Business Focus

    Source : OECS

    I’m looking forward to learning the details of this strategy in the coming months. One to watch.

    In South-East Asia, Grab and Gojek bring banking to the masses - The Economist

    Source : The Economist

    This is a follow-up too. We talked about ride-hailing in the Caribbean, and one of the difficulties highlighted was the lack of bank accounts held on the islands and the virtually non-existent use of digital payments. This article highlights the opportunities identified in South-East Asia, and I think it is a model worth adapting to our context. What do you think?


    The Future is Digital Newsletter is intended for a single recipient, but I encourage you to forward it to people you feel may be interested in the subject matter.

    Thanks for being a supporter. Have a great day.

    → 10:00 AM, May 10