A REVIEW OF ‘THE DATA JOURNALISM HANDBOOK: HOW JOURNALISTS CAN USE DATA TO IMPROVE THE NEWS’ BY GROUP SIX (6)

                         ABOUT THE AUTHORS
  The book under review is chiefly the work of 71 contributors, collated and edited by Jonathan Gray, Liliana Bounegru, and Lucy Chambers, while the graphic illustrations are by Kate Hudson.
Gray is the head of Community and Culture at the Open Knowledge Foundation and has experience in several data journalism projects. Bounegru works at the European Journalism Centre as a project manager and is also an editor at Datadrivenjournalism.net, an online data journalism resource. Chambers likewise works at the Open Knowledge Foundation, as a Community Coordinator; she is well versed in data journalism and in training journalists in the field.

         ABOUT THE BOOK
‘The Data Journalism Handbook: How Journalists Can Use Data to Improve the News’ consists of 183 pages, comprising six chapters, all dedicated to in-depth discussion of various topics concerning data journalism.
The book was published by O’Reilly Media of California, in the United States of America, in 2012.
                   INTRODUCTION
‘The Data Journalism Handbook: How Journalists Can Use Data to Improve the News’ is the work of a wide selection of accomplished journalists, and it chronicles their experiences in the emerging field of data journalism and the indispensable nature of this field to journalism as a whole.
In this book, the contributors attempt to pinpoint what the field of data journalism is while emphasizing the need for data in journalism. The field as the book reveals, involves mining data for news and putting that news in context for the audience such that the data does not remain a series of digits and facts anymore. Instead, data journalism interprets data and brings it home for the audience to understand and appreciate the impact of the data on their daily lives.
The contributors emphasize that hackers are indispensable to data-driven journalism, showing that the field is symbiotic between computer experts and journalists. In addition, data journalism facilitates the watchdog function of journalists and is equally a viable economic tool, capable of raking in huge profits and establishing media organizations as indispensable and credible sources of information.
             CHAPTER ONE - INTRODUCTION
The contributors state that data journalism is more than journalism done with data. The world we live in is digital now, as almost anything and everything, from photos and videos to audio, is described and represented with binary numbers.
What distinguishes data journalism from other types of journalism is the new possibilities it opens for combining the traditional ‘nose for news’ and the ability to tell a compelling story with a range of digital information. For example, using software to find connections between hundreds of thousands of documents, as The Telegraph did with MPs’ expenses [http://tgr.ph/mps-expenses].
Data journalism can help a journalist tell a complex story through engaging information graphics, or it can help explain how a story relates to an individual. Data can be the source of data journalism, or it can be the tool with which the story is told, or it can be both. Like every other source, the contributors suggest, data should be treated with skepticism; and like any tool, we should be conscious of how it can shape and restrict the stories that are created with it.
Why journalists should use data
  Data journalism and analysis can reveal ‘a story’s shape’ (Sarah Cohen) or provide us with a ‘new camera’ (David McCandless) through which to view the world. Data usage has shifted the job of journalists from being the first to report to being the ones telling us what a certain development might actually mean. Journalists are encouraged to use data to transform something abstract into something everyone can understand and relate to.
Data can be used by journalists to create personalized calculations or help the layman see possible solutions to complex problems. Even in the profession of journalism, journalists will guess less, look for quotes less and instead, build a strong position supported by data which no one will be able to refute.
Furthermore, getting into data journalism offers a future perspective. Data journalists and data scientists are already sought after, not only in the media but by companies and institutions around the world. They are seen as ‘sense makers’, professionals who know how to dig through data and transform it into something tangible.

Why is data journalism important?
Some leading journalism practitioners state why they think data journalism is important:

  1. Philip Meyer, Professor Emeritus at the University of North Carolina, says it is a means of filtering the flow of data.
  2. Aron Pilhofer, New York Times, says it offers new approaches to storytelling.
  3. Jerry Vermanen is of the view that data journalism updates one’s skill set.
  4. Tom Fries, Bertelsmann Foundation, calls it a remedy for information asymmetry (the inability to take in and process the speed and volume of information that comes at us).
  5. Tim Berners-Lee, inventor of the World Wide Web, says data journalism is the future.
  6. Sarah Slobin, Wall Street Journal: a way to tell richer stories.
  7. Cynthia O’Murchu, Financial Times: an essential part of the journalist’s toolkit.
  8. Pedro Markun, Transparência Hacker: a way to save time.
  9. Isao Matsunami, Tokyo: providing independent interpretations of official information.
  10. Nicolas Kayser-Bril: an answer to data-driven PR.
  11. Cheryl Phillips: a way to see things you might not otherwise see.

Data journalism in perspective
Care was taken in this handbook to emphasize that what we now call data journalism is an improvement on computer-assisted reporting (CAR), the first organized, systematic approach to using computers to collect and analyze data to improve the news. Since the 1960s, (mainly investigative, mainly U.S.-based) journalists have sought to independently monitor power by analyzing databases of public records with scientific methods, in what is also known as ‘public service journalism’. Advocates of these computer-assisted techniques have sought to reveal trends, debunk popular knowledge, and reveal injustices perpetrated by public authorities and private corporations.
Precision journalism was envisioned to be practiced in mainstream media institutions by professionals trained in journalism and the social sciences; it was born as a response to a form of journalism in which fiction techniques are applied to reporting. It can be understood as a reaction to some of journalism’s commonly cited inadequacies and weaknesses: dependence on press releases, bias towards authoritative sources, and so on.
Data journalism and computer-assisted reporting
At the moment, there is a debate about continuity and change going on around the label ‘data journalism’ and its relationship with previous journalistic practices that employ computational techniques to analyze datasets.
Some, as highlighted in the book, argue that there is a difference between CAR and data journalism. They say CAR is a technique for gathering and analyzing data to enhance reportage whereas data journalism pays more attention to the way that data sits within journalistic flow as a whole. Meanwhile, rather than debating whether or not data journalism is completely novel, a better idea would be to consider it as part of a longer tradition, responding to new conditions and circumstances.
Data journalism is about mass data literacy
Data journalism is part of the system of practices that have developed around data sites and services. Digital technologies and the web are fundamentally changing the way information is published. Quoting and sharing source materials is in the nature of the hyperlink structure of the web and the way we are used to sharing information today. By enabling anyone to drill down into data sources and find information that is relevant to them, as well as to verify claims and challenge commonly received assumptions, data journalism effectively represents the mass democratization of resources, tools, techniques, and methodologies that were previously used by specialists. While quoting and linking to data sources is currently particular to data journalism, we are moving towards a world in which data is seamlessly integrated into the fabric of media. Data journalists have an important role in helping to lower the barriers to understanding and delving into data, and in increasing the data literacy of their readers on a mass scale.
          CHAPTER TWO – IN THE NEWSROOM
In this chapter we see how data journalism sits within newsrooms around the world, and how it is uniquely adapted to fit into each of them: the Australian Broadcasting Corporation (ABC), the British Broadcasting Corporation (BBC), the Chicago Tribune, the Guardian, the Texas Tribune, and Zeit Online. We learn how to spot and hire good developers, how to engage people around a topic through hackathons and other events, how to collaborate across borders, and about business models for data journalism.
The ABC is Australia’s national public broadcaster, comprising a myriad of journalism, data, and computer professionals. Importantly, the team had a reference group of journalists and others whom they consulted on a regular basis. They got the bulk of the data for specific projects from map data files (a common kind of file for geospatial data) downloaded from government websites. In the long run, the team reported that the data journalism project brought together a lot of people who do not normally meet at the ABC. Co-location of the team also proved vital.
The contributors at the ABC summed up the big picture thus: big media organizations need to engage in capacity building to meet the challenges of data journalism; they need hackathons and workshops where the secret geeks, younger journalists, web developers, and designers come out to play with more experienced journalists for skill sharing and mentoring.
At the BBC, the term “data journalism” broadly covers projects that use data to do one or more of the following:
1. Enable a reader to discover information that is personally relevant.
2. Reveal a story that is remarkable and previously unknown.
3. Help the reader to better understand complex issues.
On the BBC website, data has been used for over ten years to provide services and tools for users. The simple tools they have succeeded in creating provide personally relevant snippets of information; these tools appeal to the time-poor, who may choose not to spend time on lengthy analysis. The BBC also makes use of a data visualization team to combine great design with a clear editorial narrative, providing a compelling experience for the user.
The Chicago Tribune’s team is a band of hackers embedded in the newsroom; they work closely with editors and reporters to help research and report stories, illustrate stories online, and build evergreen web resources.
A look behind the scenes of the Guardian Datablog reveals that it was conceived as a small blog offering the full datasets behind the paper’s news stories; today its reports have drawn many users and effected policy change.
Zeit Online, on the other hand, as its approach to data journalism, launched the PISA-based wealth comparison project to compare standards of living in different countries, using questionnaires filled in by fifteen-year-olds to assess their living situations at home. “Transparency, credibility and user engagement” are the watchwords of this online journalistic platform. The handbook also offers a practical example of the impact Zeit Online has made.
How to Hire a Hacker
Journalists are power users of data, and hackers who think outside the box need journalists to build context and help make their work relevant; it is a symbiotic relationship. Here are a few ideas, as suggested by Aron Pilhofer:

  • Post on job websites
  • Contact relevant organizations
  • Join relevant groups/networks
  • Local internet communities
  • Hackathons and competitions

The handbook emphasizes that hiring a hacker or developer is not enough; they have to be good at the job and able to take initiative at times, rather than waiting to be spoon-fed instructions.

Following the money: Data journalism and cross-border collaboration
Investigative journalists and citizens are interested in uncovering the organized crime and corruption that affect the lives of billions. To achieve this, there has to be full access to information. Any information needed may be just a click away, but corrupt government officials and organized crime groups are doing all it takes to conceal sensitive information and hide their crimes. It is the duty of investigative journalists to expose them and dismantle their mechanisms. There are guidelines which, if followed, can lead to thorough investigative journalism:

  1. Think outside your country: Criminals and corrupt officials don’t keep their money in the places they stole it from, to avoid suspicion. Crime must therefore be seen as global. There are databases available to aid this; for example, the Investigative Dashboard enables journalists to follow the money across borders.
  2. Make use of existing investigative journalism networks: Investigative journalists are grouped in organizations all over the world, such as the Organized Crime and Corruption Reporting Project, the African Forum for Investigative Reporting, the Arab Reporters for Investigative Journalism, and the Global Investigative Journalism Network. Professional journalism platforms such as IINET and the like can be used by investigative journalists for investigating stories, ideas, and clues to factual information.
  3. Make use of technology and collaborate with hackers: There are many ready-made software programs that can be used as tools for gathering, analyzing, or interpreting information, and there are programmers and hackers who know how to obtain and handle information and can assist with the investigative effort. ScraperWiki is one of the sites where journalists can ask programmers for help with extracting data from websites.

Following the guidelines above not only gives one access to information; it also minimizes harm and ensures better protection for investigative reporters who work in hostile environments. Because the journalists work with colleagues abroad, it prevents criminals from pinpointing a specific person as responsible for exposing them.
 The arguments that have been used to encourage journalists to undertake data journalism projects include:


  • Data projects don’t date
  • You can build on past work
  • Data journalism pays for itself

Data journalism may not work at first, but it builds trust and a readership over time; achieving your goal is a gradual process. For example, the Canadian media conglomerate known today as Thomson Reuters started with one newspaper, bought up a number of well-known titles in the UK, and later decided to leave the newspaper business. It has since grown on the basis of the information services it provides. There is a great deal of money to be made in providing specialized information.
    CHAPTER 3- CASE STUDIES
This chapter looks at several journalism projects and how data has been used to cover various issues effectively.
The opportunity gap
 The Opportunity Gap is a data journalism project showing that some states, like Florida, have leveled the field and offered rich and poor students roughly equal access to high-level courses, while other states, like Maryland and Kansas, offer less opportunity in districts with poorer families. The project lasted about three months, with six people working on the story and news application. The application is integrated with Facebook and helps readers find out more about the schools they are interested in.
A nine month investigation into European structural funds
In 2010, the Financial Times and the Bureau of Investigative Journalism joined forces to investigate European structural funds. The aim was to review who the beneficiaries of European structural funds are and to check whether the money was put to good use. The process was broken down into four steps:

  • Identify who keeps the data and how it is kept.
  • Download and prepare the data.
  • Create a database.
  • Double-check and analyze the data.

The Eurozone meltdown
Judging the usefulness of a dataset can be very time-consuming (Sarah Slobin). The use of graphics in illustration comes in handy: with pie charts, useful information can be conveyed, bringing the crisis closer to readers. For this data collection, a small family in Amsterdam and larger ones in Spain and Italy were needed, in order to hear from multiple generations and see how personal history shaped responses.

Covering the public purse with open spending
In 2007, Jonathan Gray proposed a project called ‘Where Does My Money Go?’ to the Open Knowledge Foundation, which aimed to make it easier for UK citizens to understand how public funds are spent. It enables users to explore data from different sources using intuitive open source tools. With this project, journalists have acquired, represented, interpreted, and presented spending data to the public. It is accessible to anyone and allows readers to make visualizations where necessary.

Finnish parliamentary elections and campaign funding
     As the book highlighted, there have been ongoing trials related to the election campaign funding of the Finnish general election of 2007, leading to stricter laws regarding election funding. To put this matter in perspective for the public, the data journalists followed these steps:

  • Find data and developers
  • Brainstorm for ideas
  • Implement the idea on paper and on the web
  • Publish the data.

Electoral hacking in real time (Hacks / hacker Buenos Aires)
      This is a political analysis project that visualizes data from the provisional ballot results of the October 2011 elections alongside demographic statistics from across the country. This section of the handbook explores the tools used, how they were developed, their pros and cons, and the implications of the Electoral Hack.

Data in the News: WikiLeaks
      The WikiLeaks war logs comprised 92,201 rows of data, each containing a detailed breakdown of a military event in Afghanistan; a team of specialist reporters was able to draw great human stories from the information, analyze it, and get the big picture, showing what the war was really like. In December 2010 the cables were released: a huge database of 251,287 official dispatches from more than 250 US embassies and consulates worldwide, which came via the Secret Internet Protocol Router Network (SIPRNet).
Mapa76 Hackathon
     The desire to scrape large volumes of data from the web and represent it visually gave birth to this project. It helps users extract data and then display it using maps and timelines. Mapa76 is a tool that eases access to information that is otherwise difficult for researchers, journalists, human rights organizations, and others to process.
The Guardian Data blog’s coverage of the UK Riots
     In 2011 there was a series of riots in the UK, which politicians, in their own opinions, attributed to various causes. The Guardian and the London School of Economics set up a project, “Reading the Riots”, in order to address the issue.
Illinois school report cards
Each year, the Illinois State Board of Education releases school “report cards”, data on the demographics and performance of all the public schools in Illinois. So much data was produced that choosing what to present became an issue. Eventually the team worked with reporters and an editor from the education team to pore over the most interesting data.
Hospital Billing
     Investigative reporters at California Watch received tips that a large chain of hospitals in California might be systematically gaming the federal Medicare program, which pays the cost of medical treatment for Americans aged 65 or older. The alleged scam was “upcoding”: patients were reported to have more complicated conditions than they did, which entailed higher reimbursement than was necessary. Fortunately, the California Department of Health keeps public records with very detailed information about the cases treated in all of the state’s hospitals. Due to the volume of information, summarizing it was a challenge; however, working from the data allowed the reporters to make independent statements rather than rely on the allegations of sources with their own agendas.
Care Home Crisis
     The Financial Times carried out an investigation into how the private care home industry exploited the elderly and their families. The approach to uncovering this exploitation included extensive data cleaning using Excel; desk and phone research came in handy as well. Other tips learned from this project include:

  1. Make sure you keep notes on how you manipulate the original data.
  2. Keep a copy of the original data and never change the original.
  3. Check and double-check the data. Do the analysis several times (if need be, from scratch).
  4. If you mention particular companies or individuals, give them a right to reply.

The tell-all telephone
      This section gives an understanding of what can be done with the data produced by our mobile phones. The details on a cellphone, viewed individually, do not make much sense, but in aggregate they can reveal a person’s habits and preferences. This aggregate was the basis of Zeit Online’s interactive map, which earned the media company an ONA award, the first ever given to a German news website.
Which car model? MOT failure rates
     The BBC obtained data about the MOT pass and fail rates for different models and makes of cars. After analysis of the figures, focusing on the most popular models of the same age, a tangible difference was revealed. The data obtained was converted into a spreadsheet for proper analysis, and to report the conclusions an Excel spreadsheet was published, giving everyone else access to the data in a usable form.
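The comparison described above can be sketched in a few lines of code: given test records for cars of the same age, tally pass/fail results per model and compare the failure rates. This is a minimal illustration, not the BBC's actual method; the model names and figures are invented.

```python
# Minimal sketch: compute MOT-style failure rates per car model.
# All model names and results below are invented for illustration.
from collections import defaultdict

def failure_rates(records):
    """records: iterable of (model, passed) pairs; returns {model: fail_rate}."""
    totals = defaultdict(int)
    fails = defaultdict(int)
    for model, passed in records:
        totals[model] += 1
        if not passed:
            fails[model] += 1
    return {m: fails[m] / totals[m] for m in totals}

tests = [
    ("Model A", True), ("Model A", True), ("Model A", False), ("Model A", True),
    ("Model B", False), ("Model B", False), ("Model B", True), ("Model B", True),
]
rates = failure_rates(tests)
# Model A fails 1 of 4 tests (0.25); Model B fails 2 of 4 (0.5).
```

Publishing the underlying spreadsheet, as the BBC did, lets readers re-run exactly this kind of tally themselves.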
Bus subsidies in Argentina
     During the bus subsidy saga in Argentina in 2002, the contributor of this section and a colleague were discussing how to start their own data journalism operation. The editor of their financial section suggested that they start with the published subsidies data, and they were able to make that data easily accessible to all Argentinians. They began by calculating how much bus companies receive monthly, working from the published data and teaming up with a senior programmer to develop a scraper. They eventually took the case to a hackathon organized by hackers in Boston, where Matt Perry developed the “PDF Spy”. The project required seven journalists, programmers, and an interactive designer, who worked on the investigation for 13 months using Excel, macros, Tableau Public, the Junar Open Data Platform, and other basic tools.
 Citizen data reporters
     The same skills used for data journalism can also help citizen reporters. After taking part in twelve workshops, some new citizen reporters began to demonstrate how the concept of accessing publicly available data in a small town can be put into practice. Their findings raised many questions initially, but after a series of experiments it became obvious that data can be used by citizen reporters; you don’t need to be in a large newsroom to use data in your articles.
The big board for election result
     Election results tables tell the stories people want to hear. Shan Carter of the New York Times graphics desk came up with the right answer, which was called the “big board”. In order not to confuse people, the journalists did not step away from what readers expect: attention is drawn first of all to the big bar showing the Electoral College votes at the top, quickly, simply, and without any visual noise.
  Visual journalism differs from other forms of design because it is meant to be both beautiful and informative. It also allows the reader to go deeper for more details, like state-by-state vote percentages. All of this makes the “big board” a great piece of visual journalism.
Crowdsourcing the price of water
     This project had to do with people contributing data about what they pay. In France, where there were concerns about tap water pricing, people were asked to scan their bills, upload them to a dedicated website, and enter the price they paid for tap water; many responded. The results showed stakeholders, such as the national bodies overseeing water, that there was genuine concern at the grassroots about the price of tap water, and led to joining forces with France Libertés in its fight against corporate malpractice. Media organizations can learn from this to partner with NGOs, ask for the source, set up a validation mechanism, keep it simple, target the right audience, choose key performance indicators carefully, and ask users to provide raw data.
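The "validation mechanism" mentioned above can be sketched as a simple sanity check on crowdsourced submissions. This is a hypothetical illustration, not the project's actual rules: the field names (postcode, price) and the plausibility bounds are assumptions.

```python
# Sketch of a validation step for crowdsourced water-price submissions.
# Field names and price bounds are invented for illustration.
PLAUSIBLE_EUR_PER_M3 = (0.5, 10.0)  # assumed sanity bounds, not official figures

def validate_submission(sub):
    """Return (ok, reason) for a dict like {"postcode": ..., "price": ...}."""
    if not sub.get("postcode"):
        return False, "missing postcode"
    try:
        price = float(sub.get("price"))
    except (TypeError, ValueError):
        return False, "price is not a number"
    lo, hi = PLAUSIBLE_EUR_PER_M3
    if not lo <= price <= hi:
        return False, "price outside plausible range"
    return True, "ok"

ok1, _ = validate_submission({"postcode": "75001", "price": "3.20"})   # accepted
ok2, why2 = validate_submission({"postcode": "75001", "price": "320"}) # rejected
```

Even a crude filter like this catches typos and unit mix-ups before they distort the aggregate picture shown to stakeholders.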
  CHAPTER FOUR- GETTING DATA
The chapter offers guidance on how to source data on the web and the formats it is most likely to be presented in during the search.
It also highlights how to access data held by public entities, through a public affairs officer or under freedom of information laws. Despite this seeming accessibility, obstacles and bureaucracy may delay receipt of the data needed, or the data may not be given in the format requested.
Quite a number of countries are launching data portals, inspired by the U.S.’s data.gov and the U.K.’s data.gov.uk, to promote the civic and commercial reuse of government information. An up-to-date global index of such sites can be found at http://datacatalogs.org. Another handy site is the Guardian’s World Government Data, a meta search engine that includes many international government data catalogues. The Data Hub, a community-driven resource run by the Open Knowledge Foundation, is another: a platform that makes it easy to find, share, and reuse openly available sources of data, especially in machine-automatable ways. ScraperWiki, in turn, is an online tool that makes the process of extracting useful bits of data easier, so they can be reused in other apps or rummaged through by journalists and researchers.
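In the spirit of ScraperWiki, here is a minimal sketch of what a scraper does: pulling structured rows out of an HTML table. A real scraper would fetch pages over HTTP (for example with urllib); here the page is an inline string so the example is self-contained, and the table contents are invented.

```python
# Minimal scraper sketch: extract rows from an HTML table using the
# standard library's HTMLParser. The table contents are invented.
from html.parser import HTMLParser

class TableScraper(HTMLParser):
    def __init__(self):
        super().__init__()
        self.rows = []        # all completed rows
        self._row = None      # cells of the row being parsed
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th"):
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag == "tr" and self._row:
            self.rows.append(self._row)
        elif tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self._row.append(data.strip())

page = """
<table>
  <tr><th>Region</th><th>Grant (EUR)</th></tr>
  <tr><td>North</td><td>120000</td></tr>
  <tr><td>South</td><td>95000</td></tr>
</table>
"""
scraper = TableScraper()
scraper.feed(page)
# scraper.rows now holds a header row plus two data rows.
```

Once the rows are in a plain Python list, they can be written to CSV, loaded into a spreadsheet, or analyzed directly.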
The World Bank and United Nations data portals provide high-level indicators for all countries, often going back many years. Another of the many alternatives suggested by the handbook is the UK Data Archive, but it requires a subscription, and its data cannot be reused or redistributed without asking permission first.
For data journalists, it is imperative to learn about the government, understanding the technical and administrative context in which governments maintain their information.
The chapter also highlights doing brief research about the data before filing a formal request: its existence, its availability, and any previous research on it. This keeps one’s expectations realistic, because you know what to expect and what is expected of you. The chapter advises journalists to know their rights, such as the time within which they should receive the requested data and the format it should come in. Journalists must make sure they know this before embarking on a search for data.
If there is need to analyze, explore or manipulate data using the computer, then the data journalist should explicitly ask for data in an electronic machine readable format. For example, if requesting budgetary information, then a format that is suitable for analysis with accounting software is needed.
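To illustrate why a machine-readable format matters: once figures arrive as CSV rather than, say, a scanned PDF, totalling a budget column takes only a few lines of code. The department names and amounts below are invented.

```python
# Why machine-readable formats matter: totalling a budget column from CSV.
# The departments and figures are invented for illustration.
import csv
import io

budget_csv = """department,amount
Health,120.5
Education,98.0
Transport,45.25
"""

reader = csv.DictReader(io.StringIO(budget_csv))
total = sum(float(row["amount"]) for row in reader)
# total is 263.75
```

The same request delivered as a scanned image would require manual retyping or error-prone OCR before any such analysis could begin.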
When sourcing information, it is advisable to shop around: freedom of information laws in Europe and other countries differ and reflect different political interests at different times, which can work to the journalist’s advantage.
The chapter also highlights the importance of knowing your rights when publishing data: you should think about copyright and other rights in the data, because it may belong to a government or an organization, and this should be taken into consideration to avoid copyright infringement.
The chapter rounds off with an exploration of business models for data journalism, starting with the example of Reuters and how valuable the specialized information it provides is.
CHAPTER FIVE - UNDERSTANDING DATA
In order for a data journalist to get stories from data, the journalist must understand the data, and to understand the data, the data journalist must be data literate. Data literacy is the ability to consume knowledge, produce it coherently, and think critically about data. It includes statistical literacy, but also understanding how to work with large datasets, how they were produced, how to connect various datasets, and how to interpret them.
     A reporter certainly does not need a degree in statistics to become more efficient when dealing with data. When faced with numbers, a few simple tricks can help the reporter get a much better story.
    The best tip for handling data is to enjoy oneself.  Treat it as something to play with and explore and it will often yield secrets and stories with surprising ease. Be creative by thinking of the alternative stories that might be consistent with the data and explain it better, then test them against more evidence. “What other story could explain this?” is a handy prompt.
        The handbook explains that a bit of skepticism around numbers is good, because it makes the data journalist handle them with caution. There is also the question of how to legitimately cut the data.
     However, there are some basic steps to be considered while working with data, involving three basic concepts:

  • Data requests should begin with a list of questions you want to answer: In many ways, working with data is like interviewing a live source. You ask questions of the data and get it to reveal the answers. Just as a source can only give answers about things he has information on, a dataset can only answer questions for which it has the right records and the proper variables. This means that you should consider carefully what questions you need to answer even before you acquire your data. First, list the data-evidenced statements you want to make in your story. Then decide which variables and records you would have to acquire and analyze in order to make those statements. It is often a good idea to request all the variables and records in the database, rather than the subset that could answer the questions for the immediate story. You can always subset the data on your own, and having access to the full dataset will let you answer new questions that come up in your reporting and even produce ideas for follow-up stories.
  • Cleaning messy data: One of the biggest problems in database work is that you will often be analyzing data that was gathered for bureaucratic reasons, and the standards of accuracy for those two purposes are quite different. The first piece of work a data journalist must undertake on acquiring a new dataset is to examine how messy it is and then clean it up. A good quick way to look for messiness is to create frequency tables of the categorical variables, the ones that would be expected to have a relatively small number of different values.
  • Data may have undocumented features: The data dictionary will tell you how the data file is formatted: the order of the variables, the name of each variable, and the data type of each variable. You will use this information to properly import the data file into the analysis software you intend to use. The other key element of a data dictionary is an explanation of any codes used by particular variables. It is advisable to always ask the agency or source giving you the data whether there are any undocumented elements in it, whether they are newly created codes that haven’t been included in the data dictionary, changes in the layout, or anything else.
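The frequency-table check described above can be sketched as follows: tally the values of a categorical column and eyeball variants that should be one category. The city names are invented examples of typical messiness.

```python
# Frequency-table messiness check: tally a categorical column and look
# for spelling/whitespace/case variants. The values are invented examples.
from collections import Counter

cities = ["Chicago", "chicago", "Chicago ", "CHICAGO", "Springfield", "Chicago"]

raw_counts = Counter(cities)
# The raw tally exposes the messiness: four spellings of the same city
# appear as four separate categories.

# A simple clean-up pass (trim whitespace, normalize case) collapses them.
clean_counts = Counter(c.strip().title() for c in cities)
# clean_counts: {'Chicago': 5, 'Springfield': 1}
```

Real-world cleaning usually needs more than case-folding (abbreviations, typos, merged agencies), but a frequency table is the fastest way to see how bad the problem is.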

 Start with the data, finish with a story
To draw your readers in, you have to be able to hit them with a headline figure that
makes them sit up and take notice. Stories can be constructed by approaching the dataset with specific queries in mind. Look through the data for key terms. Don't rely on keywords alone, though; they can be misleading.
Using visualization to discover insight
It is unrealistic to expect that data visualization tools and techniques will unleash a barrage of ready-made stories from datasets. There are no rules, no protocols, that will guarantee us a story. Instead, it makes more sense to look for "insight," which can be artfully woven into stories in the hands of a good journalist.
Each new visualization is likely to give us some insights into the data. Some of these insights might be already known, while others might be completely new or even surprising. Some insights might mark the beginning of a story, while others could just be the result of errors in the data, which are most likely to be found by visualizing it.
How to visualize data
Visualization provides a unique perspective on the dataset. You can visualize data in lots of different ways. Tables are very powerful when you are dealing with a relatively small number of data points. They show labels and amounts in the most structured and organized fashion and reveal their full potential when combined with the ability to sort and filter the data. However, tables have their limitations; they are great for showing one-dimensional outliers like the top 10, but they are poor when it comes to comparing multiple dimensions at the same time.
Charts in general allow you to map dimensions in your data to visual properties of geometric shapes. They are perfect for comparing categorical data.
Graphs are all about showing the interconnections (edges) between your data points (nodes). The position of the nodes is calculated by more or less complex graph layout algorithms, which allow us to immediately see the structure within the network. The trick of graph visualization in general is to find a proper way to model the network itself. Not every dataset already includes relations, and even when one does, those relations might not be the most interesting aspect to look at. Sometimes it's up to the journalist to define edges between nodes.
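Defining your own edges, as described above, can be illustrated with a small Python sketch. The stories, names, and the co-mention rule are all invented for illustration: people become nodes, and an edge is drawn between any two people mentioned in the same story.

```python
from itertools import combinations

# Invented example data: stories and the people they mention.
stories = [
    {"title": "Budget leak", "people": ["Ada", "Ben", "Chi"]},
    {"title": "Port contract", "people": ["Ben", "Chi"]},
    {"title": "Rail audit", "people": ["Ada", "Dele"]},
]

# The data holds no explicit relations, so the journalist defines one:
# two people are connected if they appear in the same story, and the
# edge weight counts how many stories they share.
edges = {}
for story in stories:
    for a, b in combinations(sorted(story["people"]), 2):
        edges[(a, b)] = edges.get((a, b), 0) + 1

for (a, b), weight in sorted(edges.items()):
    print(f"{a} -- {b} (co-mentions: {weight})")
```

The resulting edge list could then be fed into any graph layout tool; the modeling decision (what counts as a connection) is the journalistic step, not the drawing.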
 Analyze and interpret what you see
Once you have visualized your data, the next step is to learn something from the picture you created. You could ask yourself:

  • What can I see in this image? Is it what I expected?
  • Are there any interesting patterns?
  • What does this mean in the context of the data?

Document your insights and steps
Documentation is your travel diary. It will tell you where you have traveled to, what you have seen there and how you made your decisions for the next steps. You can start your documentation before taking your first look at the data.
Documentation is the most important step of the process, because the process described involves a lot of plotting and data wrangling.
Transformation of data
With the insights gathered from the last visualization, you might have an idea of what you want to see next. You might have found interesting patterns in the dataset which you now want to inspect in more detail. Possible transformations include zooming, aggregation, and filtering.
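The three transformations named above can be sketched on a tiny invented table of yearly spending records; the field names and figures are hypothetical, chosen only to show what each transformation does to the rows.

```python
# Hypothetical yearly spending records.
records = [
    {"year": 2008, "region": "North", "spend": 120},
    {"year": 2009, "region": "North", "spend": 150},
    {"year": 2009, "region": "South", "spend": 90},
    {"year": 2010, "region": "South", "spend": 200},
]

# Filtering: keep only the rows that match a condition.
south = [r for r in records if r["region"] == "South"]

# Zooming: narrow the view to a range worth inspecting in more detail.
recent = [r for r in records if r["year"] >= 2009]

# Aggregation: collapse many rows into one summary number per group.
totals = {}
for r in records:
    totals[r["year"]] = totals.get(r["year"], 0) + r["spend"]

print(south)   # the two South rows
print(recent)  # the three rows from 2009 onward
print(totals)  # {2008: 120, 2009: 240, 2010: 200}
```

Each transformation produces a new view of the same data, which you would then visualize again, repeating the look-transform-look cycle the chapter describes.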
Visualization tool selection
There is a wide range of tools to choose from, such as:

  • Spreadsheets like LibreOffice, Excel, or Google Docs.
  • Statistical programming frameworks like R.
  • Geographic information systems (GIS).
  • Data wrangling tools like Google Refine or Data Wrangler.
  • Visualization libraries like Prefuse.
  • Non-programming visualization software like Many Eyes.

CHAPTER SIX- DELIVERING DATA
When presenting data to the public, the data journalist is confronted with the question of whether to visualize and, if so, how. The opinions of various data journalists from reputable media organizations are sampled, and ultimately the handbook advises:

  • Using motion graphics
  • Telling the world by building a reputation as a reliable data bank
  • Publishing the data in a way that the reader is most able to relate with it
  • Opening up the data in line with social science traditions so that anyone else can verify it
  • Starting an open data platform to show how the data was obtained and used, so that interested readers can reuse it.
  • Making the data human by never forgetting the human interest angle
  • Open data, open source, open news
  • Add a download link
  • Know your scope

How you deliver your data is very important. It is not enough to write on an interesting topic; the data must also be presented to the public well, whether through beautiful visualizations or interactive web applications, for a bad data visualization is in many respects worse than none at all.
How to Build a News App
Insight is given in this handbook on how to build a news application. The form the app takes is up to the developer, but he or she must bear in mind that its format will either encourage the reader to interact with the data or drive them away.
In building the app, attention must be given to:

  • The audience and their needs
  • How much time the developer has to build the app, considering how sought after he or she is in the newsroom
  • Building an app that is easy for the developer to update, and that is at the same time technologically impressive yet easy to use

Furthermore in this chapter, time is taken to explain how important visualizations are in data journalism, because they have an impact both in the reporting phase and at the point of publishing. They reveal holes in a report to the journalist, while for the reader they can be a compelling way of understanding complex information. Sarah Cohen of Duke University suggests ten tips to aid the data journalist in this endeavour.
So much attention is given to visualizations because they can help one see familiar data in a new way, show change over time, show connections and flows, design with data, show hierarchy, navigate large databases, and envision alternate outcomes.
However, despite the seemingly multipurpose nature of data visualization, care must be taken that it is not misused. If there are few data points, little variability, or no definite conclusion, then the story is best told using text.
 Different charts tell different tales
This section of the chapter highlights various powerful data visualizations used by established media organizations.
Some of the most famous charts and graphs came out of the need to better explain dense tables of data. William Playfair was one of the foremost authors to use visualizations, as far back as 1786, when he introduced the bar chart to clearly show the import and export quantities of Scotland in a new and visual way. To date, visualizations remain just as important, if not more so.
Data visualization DIY: our tools on the web
In this part, data visualization tools are listed which are easy to use and free. They are:
Google fusion tables
This is an online database and mapping tool, used for producing quick and detailed maps, especially those where you need to zoom in. Here, one gets all the high resolution of Google Maps but with huge chunks of data. The contributor mentions that the fusion layers tool allows one to bring different maps together or to create search and filter options, which can then be embedded on a blog or a site, so you don't have to be a coder to make one.
Tableau public
Tableau Public is free. One can make pretty complex visualizations with up to 100,000 rows simply and easily using this tool. It can be used to bring different types of charts together or as a data explorer, though it is worth noting that Tableau is designed for PCs.
Google spreadsheets charts
This tool can be accessed at http://www.google.com/google-d-s/spreadsheets. The contributor states that if you are after something simple like a bar or pie chart, you will find that Google spreadsheets can create good charts, including animated bubbles. There is no need to code, unlike with the Charts API. It is quite similar to making a chart in Excel: highlight the data and click the chart widget. The customization options are also worth exploring; you can change colors, headings, and scales. The charts are pretty design-neutral, which is useful.
Data market
Also known as a data supplier, DataMarket is a nifty tool for visualizing numbers too [http://bit.ly/datamarket.explore]. It can be used to upload your own datasets or to use some of the many datasets it has to offer. It is noted that DataMarket works best with time series data, but one can check out its extensive data range.
Many eyes
Many Eyes, IBM's (International Business Machines) visualization tool created by Fernanda B. Viégas and Martin Wattenberg, allows people to upload datasets and visualize them. Once data is uploaded, it cannot be edited, so it is crucial to check it thoroughly before creating a visualization.
Color brewer
Color Brewer is not strictly a visualization tool but a tool for choosing map colors. Here, base colors, and codes for an entire palette, can be chosen. The authors give a list of other tools to choose from.
How we serve data at Verdens Gang
In this segment, the authors say that data journalism, as a new form of journalism, should be about bringing new information to the reader as quickly as possible, whether through a video, a photo, a text, a graph, a table, or a combination of these. To further buttress this point, an example is given of how Verdens Gang (VG) serves data: through numbers, networks, maps, text mining, and concluding notes.
Public data goes social
Unless data is contextual or capable of sparking a discussion, it does more harm than good. An example is given by Oluseun Onigbinde of BudgIT, an online platform that uses data visualization to engage people in public expenditure. He emphasizes the need for feedback and peer-to-peer sharing.
Engaging people in your data
We are made to understand here that it is not enough to dump data on readers; they must be engaged. The first step to achieving this is building trust: ensuring that the content reflects what they want and that they are being listened to. The best platform for engaging your audience is social media. Data journalists are also advised to publish the raw data along with their stories, both to enhance the experience for readers and to sharpen the journalist, since access to the raw data will spur some readers to pore over it and detect any mistakes in the story.
CONCLUSION
The field of data journalism or data-driven journalism is vast, requiring all hands from very different fields on deck. Data journalism is a veritable tool in shaping our societies. Contrary to what many may think, data journalism is not out to chase away the traditional mode of journalism, storytelling; rather, it is here to complement it.
A pivotal point made in this handbook is that data journalism is not about dumping figures and visuals on the audience, but about finding the stories in them as they relate to various audiences. In line with this, audiences relate better to stories with visualizations that help them make sense of the data. It is also important, as noted, that after the information is disseminated, audience reactions should be gathered to sharpen the incisive nature of the stories.

David McCandless, in this 17-minute video, talks about the importance of visualization in data journalism at TED Talks in 2010 (https://www.ted.com/talks/david_mccandles_beauty_of_data_visualization/up-next).
Simon Rogers, one of the contributors to 'The Data Journalism Handbook: How Journalists Can Use Data to Improve the News', describes data journalists as the new punks in a powerful 9-minute talk at TEDxPantheonSorbonne (https://m.youtube.com/watch?v=h2zbvmXskSE).
