
Statistics: a scientific tool for decision-making

Saturday, 27 December 2008


Shahjahan Khan
We are living in a global information society where the flow of information is ever increasing. Statistics plays a major role in shaping and providing scientific information that is useful in almost every aspect of human life. It has made its mark in fields from astronomy to administration, business to biology, and housing to health; you name it, statistics covers it. Modern decision-making, be it for an individual, a business, a government or an international agency, increasingly uses statistical methods to improve the quality of information and decisions. Without statistics, data cannot reveal the hidden information needed to plan, monitor and assess development.
To prove the effectiveness of a new drug, to assess the impact of air or water pollution on living creatures, to measure demographic or economic growth, to forecast the economic outlook or the growth of different types of crops, in short, for a hundred and one purposes, we have little choice but to use statistics in one way or another.
Clearly, unlike any other discipline in science or arts, statistics has made a very successful and effective headway in almost every aspect of modern life, and its contribution is attracting universal recognition.
"Statistics have a key role to play in opening up government because of the role they play in holding the Government to account," said Jack Straw (vide Statistics & Society, 2007, Royal Statistical Society News, p.5).
Statistics is an art as well as a science of making decisions in the face of uncertainty. Decisions based on sound statistical findings are scientific, and hence more likely to lead to desirable outcomes. Many statistical methods are based on random samples to protect against personal or environmental biases affecting the results. Both graphical and numerical summaries reveal facts that are often hidden in a mass of data. The hidden 'gold' in the raw data is exposed using appropriate statistical methods for the benefit of mankind. It is like scanning the data to recover valuable 'gems' that would otherwise go unnoticed. Statistics puts information in the right perspective so that it is ready to be used in decision-making.
There are three main activities of statistics: (a) production of data through experimental or observational processes, (b) presentation of data using graphical and numerical methods, and (c) decision-making on the characteristics of populations via estimation and tests. Production of data via experiments and sample surveys is a common practice, and only sound statistical methods can produce the scientific data that is essential for the validity of any statistical inference. Scientifically produced data are often explored using graphs, charts and numerical summaries. However, statistics goes far beyond the simple exploration of data. It is useful for forecasting and predicting variables of interest from any data-generating process. Furthermore, statistical methods help identify lurking variables and confounding factors and isolate the actual treatment effect, so that valid conclusions on the cause-and-effect relationship between variables are possible.
"Statistics is the grammar of science," Karl Pearson (Statistics & Society, 2007, RSS News, p.16).
Statistics in action: Every developed nation has adopted statistical methods in local, regional and national planning. Statistics provides a much-needed benchmark for the current state of affairs, so that schemes can be undertaken to address any issues or to improve specific aspects of any process of interest. Moreover, statistical methods are capable of assessing the available resources, which is essential for formulating and implementing effective development strategies, either for individuals or for groups. Furthermore, statistics is a vehicle for monitoring ongoing progress (or regress) and adjusting the process to improve the system with a view to achieving the ultimate goals. It also allows the performance of different techniques or treatments to be compared and the effectiveness of any such methods to be checked. Some statistical methods are also capable of attaching confidence levels to their results.
Developed nations embraced statistics in the production industries as a method of achieving high quality in manufactured products, thereby minimising the risk of poor or defective products while maximising profit. The method allows timely intervention in the manufacturing process so that the machines can be adjusted to rectify any faults adversely affecting the quality of the products. As a result, total quality management and statistical quality control are very much an integral part of modern production industries. Statistical methods can also help determine an appropriate warranty period for manufactured products to minimise the potential risk of loss for the producer.
Importance of statistics: Japan has recognised the importance of statistics by observing October 18 as 'Statistics Day' every year. The day was fixed on July 3, 1973 by the Japanese Cabinet for the purpose of strengthening national interest in and understanding of the importance of statistics, and promoting public cooperation in the surveys held by the central and local governments. Notably, October 18 is the day on which the Japanese Government proclaimed the order to compile its first modern statistics, 'Fuken Bussan-hyo', in 1870. The date was converted from the lunar calendar day, September 24, to the solar calendar day, October 18.
In ancient times, kings and rulers used statistics, based on population censuses, primarily to procure food for the people and to raise armies for security. Governments have used censuses ever since for various purposes. Making decisions on some characteristic of the entire population based on a representative part, or sample, of it is a huge advantage in terms of both cost and management, with little loss of accuracy when the sample is well designed. In reality, sampling, as opposed to a census, is often the only realistic choice.
Planning for socio-economic development requires statistics. How many doctors are needed over the next 10 years? How many hospitals or schools need building, and over what period of time? How many police will be required to safeguard a community? How many graduates and technical persons should be trained over the next 20 years? What items should be exported, and in what quantity? What items should be imported? Answers to all of these questions, and many more, need sound statistics. Statistical data help determine the needs, and statistical methods are then required to forecast and reassess the outcomes.
Governments are required to submit reports on a variety of statistics to different UN bodies as well as to regional and international organisations. The international standing of a nation often depends on these statistics. The classification of nations as developed or developing is also based on statistical data, among other things.
Every modern government runs a central statistical organisation such as a Bureau of Statistics or a Central Statistics Office. These offices are managed by trained statisticians of diverse specialisation. They are responsible for diagnosing the state of health of the national economy, industry, agriculture, trade, education, medical care and so on. These offices prepare periodic reports on the unemployment rate and the consumer price index, two vital indicators of any national economy. The gross domestic product of a nation is also determined by appropriate statistical methods.
The fiscal and monetary management of a nation is also largely dependent on statistics. Central or reserve banks set interest rates based on statistical data, including the current state of exports and imports, consumer spending, currency exchange rates, the consumer price index, the inflation rate and the overall health of the economy.
Politicians also use statistics, in the form of various surveys, to assess their personal popularity and that of their party or government. The decision on when the Government will go to an election is very much dictated by the outcomes of the opinion polls, often without this being admitted in public. The so-called 'exit poll' is often used to forecast the outcome of an election before the actual votes are counted. Normally, in democratic nations, governments are very reluctant to call an election if the opinion polls are not strongly in their favour.
Governments in Western countries use statistics to formulate public policies. In many cases, the decisions of these governments simply reflect the opinion of the people, which is measured by statistical means. It is a common practice to gauge the popular mandate, relying on statistical measurements, before embarking on building a national consensus on any serious matter of public interest related to national issues or security. This is not the case in many developing countries, perhaps due to the absence of a state-of-the-art statistical system.
Unfortunately, abuse of statistics is not rare. Some people, particularly those in power, abuse statistics because they are neither willing nor prepared to face the reality and truth exposed by statistics. This is not a problem of statistics as such; rather, it is a problem of the ill motives of the users (or rather abusers) of statistics. A knife can cure if it is used by a surgeon, but it can very well be a lethal weapon if abused by a murderer. Correct statistics are innocent, and hence do not deserve any wrong labelling. The blame must go to those who abuse them, particularly to those who do so knowingly, selectively or deliberately to mislead the common people for apparent personal gain.
Some popular statistical methods: There are many statistical methods that are used every day to analyse data and make decisions that affect the lives of millions of people worldwide. Often the importance and contributions of these methods are unknown to, and hence undervalued by, the public at large. Highlights of a number of popular statistical methods are covered below.
One of the greatest contributions of statistics is the introduction of sampling methods, which enable decision-making on any characteristic of the entire population based on the data from a small but representative part of that population. The simple random sample, which guarantees every individual or item in the population an equal chance of inclusion in the sample, is the theoretical foundation of the sampling distributions of different statistics. Commonly used inferences, such as the construction of confidence intervals and tests of hypotheses, are based on the idea of the sampling distribution of a statistic. Thus samples and sampling distributions are the keys to any statistical inference.
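As an illustration only, the idea of drawing a simple random sample and attaching a confidence interval to the sample mean might be sketched as follows. The population of household sizes is made up, the helper names `simple_random_sample` and `mean_confidence_interval` are hypothetical, and the interval uses the large-sample normal approximation (multiplier 1.96 for roughly 95% confidence), which is an assumption of this sketch.

```python
import math
import random
import statistics


def simple_random_sample(population, n, seed=None):
    """Draw n items without replacement; each item is equally likely to be chosen."""
    rng = random.Random(seed)
    return rng.sample(population, n)


def mean_confidence_interval(sample, z=1.96):
    """Approximate 95% interval for the population mean (normal approximation)."""
    mean = statistics.mean(sample)
    std_error = statistics.stdev(sample) / math.sqrt(len(sample))
    return mean - z * std_error, mean + z * std_error


# Hypothetical population: sizes of 10,000 households (made-up values).
rng = random.Random(1)
population = [rng.randint(1, 8) for _ in range(10_000)]

sample = simple_random_sample(population, 200, seed=2)
low, high = mean_confidence_interval(sample)

print(f"sample mean             = {statistics.mean(sample):.2f}")
print(f"approximate 95% interval = ({low:.2f}, {high:.2f})")
print(f"population mean          = {statistics.mean(population):.2f}")
```

Only 200 of the 10,000 households are examined, yet the interval gives a range of plausible values for the mean of the whole population.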
The central limit theorem (CLT) is a very powerful result that allows us to approximate the distribution of the sample mean when the sample size is large, even if the distribution of the population from which the sample is drawn is unknown. Surprisingly, the sampling distribution of the sample mean is approximately normal, and hence symmetric, even if the distribution of the parent population is highly skewed or flat (uniform).
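A small simulation can make this tangible. The sketch below assumes an exponential parent population with mean 1 (a highly right-skewed distribution chosen purely for illustration) and a sample size of 50; the function names are hypothetical. It compares the skewness of individual observations with the skewness of many sample means.

```python
import random
import statistics

rng = random.Random(42)


def sample_mean(n):
    """Mean of n draws from an exponential population with mean 1."""
    return statistics.mean([rng.expovariate(1.0) for _ in range(n)])


def skewness(values):
    """Moment-based skewness: near 0 for symmetric data, positive for right skew."""
    m = statistics.mean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


# 5,000 individual observations versus 5,000 means of samples of size 50.
observations = [rng.expovariate(1.0) for _ in range(5000)]
means = [sample_mean(50) for _ in range(5000)]

print(f"skewness of individual observations: {skewness(observations):.2f}")
print(f"skewness of sample means (n = 50):   {skewness(means):.2f}")
print(f"average of the sample means:         {statistics.mean(means):.3f}")
```

The sample means cluster around the population mean of 1 and are far less skewed than the raw observations, which is the behaviour the theorem predicts.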
Another remarkable contribution of statistics is the introduction of the Randomised Controlled Trial (RCT), frequently used in scientific and medical experiments. Data from RCTs are used for a variety of statistical analyses and inferences. Combining data from independent studies that use RCTs provides the much-needed larger sample size that increases the power and validity of various statistical procedures.
The introduction of the placebo and of single- and double-blind experiments has changed the way the effectiveness of drugs and medical procedures is assessed. Random allocation of treatment and placebo helps isolate the effect of the treatment from the effect of any other factors. Double-blind allocation of treatments and placebo removes the potential bias likely to be introduced by personal choice or favour.
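To show what random allocation and blinding involve in practice, here is a minimal sketch. The participant identifiers, the function names `randomise` and `blind`, and the neutral codes 'A' and 'B' are all hypothetical; a real trial would follow a registered protocol and often uses stratified or block randomisation.

```python
import random


def randomise(participants, seed=None):
    """Shuffle the participants and split them into two equal-sized arms."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "placebo": shuffled[half:]}


def blind(allocation, seed=None):
    """Replace arm names with neutral codes; the key stays with a third party."""
    rng = random.Random(seed)
    codes = ["A", "B"]
    rng.shuffle(codes)
    key = dict(zip(allocation, codes))  # maps arm name -> neutral code
    coded = {
        person: key[arm]
        for arm, people in allocation.items()
        for person in people
    }
    return coded, key


# Hypothetical participant identifiers.
participants = [f"P{i:02d}" for i in range(1, 21)]

allocation = randomise(participants, seed=7)
coded, key = blind(allocation, seed=8)

# Patients and assessing doctors see only the codes, never the key.
for person in sorted(coded)[:5]:
    print(person, "-> group", coded[person])
```

Because chance alone decides who receives the treatment, other factors such as age or severity of illness tend to balance out between the two arms.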
Testing the significance of differences in mean treatment effects via the analysis of variance (ANOVA) is another popular statistical technique used in different disciplines. The actual test is based on the ratio of the variability between the treatments to the variability within the treatments.
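The ratio described above can be computed directly. The following sketch, with a hypothetical function name and made-up measurements for three treatments, calculates the one-way ANOVA F statistic as the between-treatment mean square divided by the within-treatment mean square. Converting F into a p-value requires the F distribution with (k - 1, N - k) degrees of freedom, which is not done here.

```python
import statistics


def one_way_anova_f(groups):
    """F statistic: between-treatment mean square over within-treatment mean square."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variability of the treatment means around the grand mean.
    ss_between = sum(
        len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups
    )
    # Variability of the observations around their own treatment mean.
    ss_within = sum(
        sum((x - statistics.mean(g)) ** 2 for x in g) for g in groups
    )

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Made-up responses under three treatments.
treatments = [
    [1, 2, 3],
    [2, 3, 4],
    [5, 6, 7],
]

f_statistic = one_way_anova_f(treatments)
print(f"F = {f_statistic:.2f}")  # prints "F = 13.00"
```

A large F value indicates that the treatment means differ by more than the scatter within each treatment would explain.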
The linear regression model is probably the most frequently used statistical model for predicting the value of a dependent, or response, variable for given values of a set of independent, or explanatory, variables. Nonlinear models are also available where appropriate. Pearson's product-moment correlation coefficient measures the linear association between two quantitative variables. On the other hand, the association between two categorical variables is tested by the chi-square test of independence.
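The three tools mentioned in this paragraph can each be written in a few lines. The sketch below uses made-up data and hypothetical function names, fits a simple regression with a single explanatory variable only, and stops at the chi-square statistic without looking up its p-value.

```python
import math


def pearson_r(x, y):
    """Pearson's product-moment correlation between two numeric lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def least_squares_line(x, y):
    """Slope and intercept of the least-squares line y = intercept + slope * x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx


def chi_square_statistic(table):
    """Chi-square statistic for independence in a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Made-up quantitative data: explanatory variable x and response y.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
slope, intercept = least_squares_line(x, y)
print(f"fitted line: y = {intercept:.1f} + {slope:.1f} x")
print(f"correlation: r = {pearson_r(x, y):.3f}")

# Made-up counts for two categorical variables (rows by columns).
counts = [[10, 20], [20, 10]]
print(f"chi-square statistic = {chi_square_statistic(counts):.2f}")
```

The fitted line can then be used for prediction; for example, at x = 6 it gives intercept + slope * 6.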
'Opinion poll' is a well-known term in the political circles of any developed democratic nation.
Pre-election polls are common in Western democracies. Often, the party in power relies on opinion poll results to decide on the date of an election in order to tap into popular support. The exit poll is an increasingly used method of foretelling the results of an election. Government policies on issues of public interest are often shaped by the results of opinion polls, particularly when an election is not due. Nowadays, every established political party in a developed nation has its own pollsters to gauge public opinion on issues of national and international importance.
Shahjahan Khan, PhD, is in the Department of Mathematics & Computing, Australian Centre for Sustainable Catchments, University of Southern Queensland, Toowoomba, Qld. 4350, Australia. He can be reached by email at khans@usq.edu.au