Data: What Was / Will Be Always Lost

When we look at humans and their progress, it is the story of information which was condensed to become human knowledge. It is easy to overlook the tools invented along the way to preserve and retain this knowledge easily. We have gone from writing letters on sand and mud tablets, to storing entire libraries within a grain of sand. Each one of these tools of storing information has its benefits and drawbacks – the latter are often neglected until the next new medium and method become available.

At least for the things in the physical realm – paper, sculptures, carvings – there is a better chance of surviving if they find the right environment. Even in its degradation, physical data, I feel, is more forgiving. There is real effort involved in destroying information represented in a tangible medium. More time, not less, is also required to create this information in the first place. It is possible to salvage from physical ruins which could only have been created by thousands over decades, to see traces of drawings that a playful child once scratched on a rock, to find an act embedded on a path or the foundation of a house as a footprint, or smoothened by wear – all of these convey some part of the information about that place and time. The physical world provides a greater range of media to leave information in – and it is always available and being written on.

Language is that offshoot of the physical realm towards abstraction which is the most familiar to us. Abstraction offers speed, efficiency and fluidity, but also relies on more complicated and specific tools to create and unpack it. The tools themselves can be lost, or become unrecognizable. Even within language, one could make a scale from what is the most tangible (writing) to what is the most intangible (speech) – another reason why we might have a greater amount of written material available to us (even if we might not interpret it correctly) than records of spoken languages – this applies especially to the pre-electronic media era. Interestingly, electronic media could be considered another distant cousin of language itself. This abstract medium offers the most utility in storing the more intangible of our physical experiences (like sound and vision).

Digital data, to me, is fragile. And even though it enamours us and offers the great convenience of packing so much information and making it available at the speed of light, it stands on a structure which doesn’t take much to fail. It is not just an issue of the data’s validity and accessibility, but also of the methods to retrieve it. Firstly, data in itself can get destroyed because of environmental factors and physical damage to the storage medium – both of these, even at a reduced rate, can cause the data to decay to the point of becoming completely unreadable or inaccessible. And even if these factors are accounted for, there is always the common risk of hack attacks, password lockouts, overwrites and accidental deletions. Not that the equivalents of these risks do not exist for physical media, but the intangibility here makes it far easier for all of these to occur. And then there is also the physical world this intangibility relies on. It takes one mass-hack, power grid failure, a skipped backup, a broken phone, a mysterious account shutdown or just a forgotten password to realize how much more distanced we might be from this realm than we think. Just like language, digital data offers an ease of transmitting information compared to other physical media, but it can also quickly die out if there is no one left around to speak or understand it – this is the case with obsolete software and code, where we lack the tools and the hardware to even read data from just a couple of years back.

Another way information is lost, especially at a larger scale of availability, is when it becomes too common. In this scenario, everyone assumes that someone else will have the data saved, but no one really does. This is becoming increasingly visible in the digital domain as large networks like the internet begin to age. Closely related to this kind of information loss is the common practice of removal on purpose.

But all of these issues are known, and we are still riding on trust, now attempting to decentralize data storage – which might just be the best possible way for us to retain the information we are generating at an ever-increasing rate. Most of us do not think about the fact that the most personal of our chats, memories, movements, images, secrets and thoughts are residing on multiple hard disks across the world, hooked in and being swapped out constantly in a warehouse.

But I also think of an unknown genius of our times, meeting an untimely death, leaving behind thoughts and ideas locked up in a disk or a cloud account. These will never be revealed to others: even when some services allow for retrieval, not many will bother, and sifting through a person’s lifetime’s worth of data will take at least another lifetime. His ideas would have had some fighting chance of being found if they had been written up in a diary and hidden in a hole in the wall.

If you care about maintaining your own or humankind’s legacy, and about preserving it, you will not enjoy the ride that’s ahead – no doubt the tools and media will improve, but a lot will have already been lost by then and in between the increments. If you are concerned about privacy, you can take comfort in the fact that your data will eventually be lost. Someday, this post too will become obsolete enough to be moved to a sole backup disk which is not cared for enough, and all that I have written here will be lost.

Life’s Lessons From Hard Disk Crashes

Many years ago, on a winter night, I restarted my laptop in the middle of a Windows update. The computer did not boot properly again, as something had gotten corrupted (Windows 7), and I did all the wrong things to get it back to a working state. By the time I could hear the early morning traffic of delivery vehicles on the road, I realized I had lost most of my data, which was largely entertainment but also some really old photos and portfolio pieces, ten years of my past in photos – zapped out of existence. No amount of self-attempted software data recovery would let me get those files back. I had overwritten my data multiple times on that limited hard drive in the panic of trying to fix it. It was a devastating feeling and I had learned a lesson, after which I slowly created and bettered a routine of regular backups. It soothes an unexplained archivist’s trait which I have, and have only become aware of in recent years.

A snippet from that night (seems like a UX issue which started it all) –

It happened today, once again, even though I had been preparing myself for this for years. And surprisingly, it happened in the same manner it had the last time. I remember losing tonnes of data back then – most of my music and movie collection, painfully curated and maintained, was lost to chunks that would take hundreds of years to sew together and make sense of. I also lost my sketches and work for the portfolio – I have no idea if I will be able to get them back. So, either I am absolutely stupid or the Linux community needs to make the disk partitioning experience more forgiving and less rigid – it seems like it is made in such a way that the user’s actions ultimately lead him to delete/erase a partition only to clutch his hair for the rest of the night.

But I am glad this happened now and at no other time – if I hadn’t had the scattered backups at all, things would have been disastrous. At least it got me into reorganizing my backup storage and dealing with duplicates and other terrifyingly mixed-up file structures and content – to look at the problems I had been neglecting. I will have to create a system.

Dramatic, but not the first time it had happened either – this was what I call the third crash. The ones before had taken out most of the data with them, but that occurred at a time when a lesser part of my life was on the computer – my life’s work and experiences until then, and my storage media, had been limited in size. In the decade between 2005 and 2015, things had changed for my generation in India. But I had no clue it would become what it is now.

The initial few years were very iterative and painful as I was very careless, yet constantly occupied with the idea, and also limited in resources. I went hard at it, probably to compensate for what had happened the last time. My friends also noticed this and they pointed it out, often mockingly. I realized that this was going too far and it did not have to become the centerpiece of my existence. The archivist tries to preserve, but with a deeper understanding that some or all of what he preserves will be lost – all I could do was to not add any noise to it.

Now that I have a system in place (though far from perfect), I do sleep at night without a worry – owning less, consuming and generating less, and accepting that all the information ever created will eventually be lost. The realization prevented this habit from turning malignant and becoming an OCD-like issue. Yet, it is still amazing to meet regular people who do not care about any of this at all, and would not even think twice if they lost all their data with their phone (which most of them do – multiple times within a couple of years). They also already have most of their lives on their phones or computers. These people are already playing on chaos’ side by its rules without even knowing about it. I like that. I wish I could be carefree like that, and because I cannot be, I had to figure out a middle ground. Life is all about surfing on the edge of order and chaos, as many wiser people have said before me in different words. I sometimes feel that this was a loser’s compromise. But then again, I do not know of anyone who ever won against entropy.

Trees – A Physical & Living Data Visualization Tool

Looking at the tree in my backyard yet again, I tried to guess its name but unfortunately I am no arborist – for me it is a tree of medium height which blossoms with white flowers in spring and glows bright red before bursting naked into leaves towards the end of fall. Maybe it is the lockdown, but I fully recognize that I am obsessed with this tree. This is a being I look at first thing in the morning every day, and I take pictures of it at random in different weather and seasons. This is a much more serious case than the mango tree I grew up in front of.

I look at trees as beings of a superior intelligence. Describe it however you may but, for me, intelligence is nothing but efficiency. A humanist might say things like, “But we have satellites and nukes, we have the internet making the electrons dance for our entertainment – how can you say that trees are more intelligent than us?” I ask in return, “All of this is impressive, but what is it for?” We are solving these complex problems for nothing but our survival in the most efficient manner possible, and we are far from getting there. Meanwhile, these ancient beings have perfected survival in the leanest way, without the extra paths and layers of abstraction, and we don’t even know much about them despite their simplicity. Yes, we can cut and burn down forests at the greatest rates possible, but that is no efficiency when it comes to our survival. Also, these beings exist for many human lifetimes with ecosystems they set in place and sustain.

The effects of Gleick’s Chaos were still fresh in my mind; maybe riding the summer along in that mood was what left a deep imprint, but all I can say is that my outlook towards order and disorder will never be the same. I am as much a mathematician as I am an arborist, so I will just write whatever comes to my mind as I glance at this entity which stands slightly off center in my backyard (it stands at a very visually harmonious point and I am certain it has something to do with the golden ratio, but I will not pursue this further). The idea comes from the branches of this tree, which grow differently on its different sides, most probably because of the sun’s path. On one side they scatter and spread out at random whereas on the other they rise vertically, clustered together but never crossing each other, almost like the bristles of a toothbrush.

The algorithm taking care of this truly must be magnificent, and it would have worked as effectively for the thousand different places where this same tree could have grown. Every morning as I look at this tree while finishing my first cup of coffee of the day, I see new patterns in it no different than the ones left at the bottom of my cup by the grounds in my last careful sip. I think the only other entities which place importance on this tree are the two squirrels that fool about on it.

We can consider trees to be a mathematical function, like all living things, in a dynamic equilibrium. They could, however, be treated as the best examples of a living function since they are static in position all their lives. Once an organism can be held down in space for its entire lifetime, a lot of additional variables can be avoided. I am certain that such a function for a tree would still be simpler than that of an amoeba. Some of the key variables this function consumes are the wind, water, sunlight, gravitational force, soil nutrients and temperature – the basic determinants of the state of the environment and the world, especially in the volume local to the tree. The tree is not a living function which merely consumes these variables; it influences them in return as well. Without going into the details, one can safely call the tree a physical and live data visualization of all these factors impressed on that given volume of space in the world. Thus, the tree is the best codification and representation of the environmental factors put upon a volume on earth. It is just that we have limited knowledge about which factor or measurement points to which facet of the environment. I think our study of trees has been just that, trying to find those relations – but hopefully we will not get caught up in the relations we make to the intermediary factors instead of the bigger-scale phenomenon.

Destin from Smarter Every Day mentions how a tree vibrated from the trunk along two perpendicular planes still shows oscillations at its top-end branches along the same plane – this definitely has something to do with the cancelling effects of the oscillations and their dispersal in the ‘chaotic’ system which the branches are. It must be mentioned that the branches are themselves the result of the wind gusts the tree experiences throughout its lifetime – to know how this oscillating plane of the end branch relates to the average historical wind direction would be an interesting study to undertake.

Trees are at the intersection of the world and the measuring tools that we use to understand and predict it. Or, they can also be considered measuring instruments which we have not yet fully learned to use. They are a physical data representation of complex systems which our supercomputers will never be able to accurately model – and maybe we should never sweat over knowing these complex non-deterministic systems at all, but just work with the simplified data a tree can provide. I also see future weather forecast stations having networks of tree farms being measured and monitored across the world.

Data coded in morphology has long been known of, I assume; but looking at the organism itself as the visualization of the external factors might require more exploration – it is like reading the lines on someone’s palm to predict the weather. All we have to do now is to learn how to read nature and its brilliant visualization tools. This is something we, as humans, have done since our very beginning, but I feel our tech-saturated present has led us astray from that path.