Saturday 13 October 2012

Creating automatically series of repeated sequential numbers in Excel

Even though excel can be very handy for many user-types and for many different applications, there are some cases where it seems that its developers were not carrying much about the built-in functionalities. As a result, they probably considered that all user are able to produce small pieces of code or are born with programming way of thinking, which obviously is not the case in our everyday life. After this small introduction, let's go straight to the point:

I'll summarise here three different ways of producing automatically and with very little effort a series of repeated sequential numbers. What does that mean exactly?
- Sequential numbers: Even though it is self-explanatory, obviously we're talking about a series of (user specified) numbers, that are within a specified interval,and there are no numbers missing in the sequense. e.g. 0,1,2,3,4,5,6,7,8,9 is a series of 10 sequential numbers in [0,9]
- Repeated: with this we imply that each number is repeated X times before we move to the next digit. In our example, if we wanted to repeat X=3 times, our series of repeated sequential numbers is 0,0,0,1,1,1,2,2,2,3,3,3,4,4,4,5,5,5,6,6,6,7,7,7,8,8,8,9,9,9.

Obviously, if you have such a limited series, you don't need any kind of automation, you just go to the related excel cells and fill in the numbers, period. BUT what do you do if you have a series of 85 numbers that are repeated 10.000 times? Obviously, you cannot fill in those 850.000! lines manually.

While struggling to find a solution and with quite some time spent on the internet, i've figured out the following three solutions, in order of preference for their use: 
  • using the command 'MOD', which provides the remainder of a number when it's divided with a specific denominator, nested in an 'IF' conditional.
  • using the command 'INT', which rounds downwards a given number, to the closest integer.
  • using an 'IF' conditional formula.
In detail, we will use as an example a series with 20 numbers repeated for 10 times (remember: how many times these numbers are repeated is irrelevant to the formulas needed to input in excel). It's easier to just calculate those numbers in a separate sheet, then copy only the values to your destination cells, and discard the formulas:
  1.  For the first solution we follow these steps:
    - Insert the first number, e.g. 0, in our first cell, e.g. A1
    - Insert in the cell below, e.g. A2, the following command: =IF(MOD(ROW(),20)=1,A1+1,A1)
    - Drag down the formula to the desired number of cells, e.g. till A200 in our example.
    - Obviously, the only adjustable parameter is the number '20' which can be set according to the needs of the user.
  2. For the second solution we follow this step:
    - Insert in the first cell, e.g. A1, the command: =INT((ROW()+19)/20)
    - Drag down the formula to the desired number of cells, e.g. till A200 in our example.
    - In this case, the adjustable parameters are obvious, '19' and '20'.
  3. For the third solution we follow these steps:
    - Insert a zero (0) in each cell of A1:A20 (twenty cells)
    - Insert in the cell below, e.g. A21, the following formula: =IF(A20=E1,A20+1,A20)
    - Drag down the formula to the desired number of cells, e.g. till A200 in our example.
    - This solution is obviously harder to update and the least effective of the three, but i kept it in the list for the crazy ones that may prefer it.
Regarding the first two approaches, they work equally well (like a charm) and have saved me plenty of precious time.

This way of producing repeatable sequential numbers can be very handy for anyone working on Monte Carlo methods, uncertainty analysis and/or sensitivity analysis of models as well as for any other data handling purpose. It is well expected that anyone could argue and point out "Why work on excel for a Monte Carlo application instead of Matlab or R" and I could immediately agree. But this discussion is out of the scope of my post.

For any questions or remarks, feel free to leave a comment or contact me by email.

Thursday 11 October 2012

Update (and justifications)

It's interesting to notice that my last post on the blog was published in October 2011...and now we've already reached October 2012! There are two interpretations of this fact: the optimistic and the pessimistic one....read on.

Obviously the pessimistic would be that "nothing worth to share with the world has happened during this one year of my PhD programme and no valuable work has been produced". Luckily, this is not the case and we are on the optimistic side: There have been numerous interesting things happening, such as seminars, conferences, projects, long fights with MS Excel, interesting results produced, submittions of articles and abstracts, collaborations with other PhD students of my group, non-academic activities, lots of travelling, weddings and kids everywhere...and thus, finding time to report, on a regular basis, on everything that has happened was practically impossible (given my origin and the strong forces of procrastination that rule this world)!

To my atonement, some posts have been saved as drafts for months on the blog and were not published on time (or never if you prefer), other were kept forever in the word document containing 'updates for the blog' and so on. 

Hopefully, i'll be able to produce a summary of only the really interesting things and make few posts in the next days. Please bear with me and thank you for the interest in my blog :)

PS. I won't promise (cause i hate breaking promises) but my idea of providing small and useful articles on MCDA, uncertainty handling, sensitivity analysis, challenges with excel and more, is still vivid and hopefully will be implemented soon. The first two topics will be excel related and specifically on the 'generation of random weights for Monte Carlo analysis' and 'generation of long series of repeatable numbers'.