OddThinking

A blog for odd things and odd thoughts.

Common Phrase Experiment

Did you ever notice that some people (all people?) have certain phrases that they use a disproportionately high percentage of the time?

You may find it infuriating or endearing – probably depending on what you think of the person anyway.

I was wondering what my commonly-repeated phrases might be.

Then I realised I had a large corpus of my writing on this blog, and I may be able to do an experiment to find out.

To any linguists reading: I know, I know. Writing ≠ conversation. I need a control to compare. Phrase ≠ series of words in a row. I need a bigger sample size. Go with me here; I am just playing.

Here’s the result:

This blog contains (as of earlier today): 197,514 words in my posts.

Of those, 13,109 are unique words – arguably that’s related to my vocabulary size. I am sure the linguists have some better definitions of vocabulary size.

The most popular words are: the (6% of all words), to, a, I, of, and, it, that, is, in.
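For anyone who wants to try this on their own writing, here is a minimal sketch of how counts like these could be produced. It is not the script used for the numbers above. The `tokenize` and `word_stats` names are made up for illustration, and the tokenising rule (lowercase everything, keep apostrophes inside words, throw away digits and punctuation) is an assumption that will shift the totals a little depending on how you define a "word".

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase words, keeping apostrophes inside words.

    Assumption: digits and punctuation are not words, so they are dropped.
    """
    return re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower())


def word_stats(text, top=10):
    """Return (total word count, unique word count, most common words)."""
    words = tokenize(text)
    counts = Counter(words)
    return len(words), len(counts), counts.most_common(top)


if __name__ == "__main__":
    sample = "The cat saw the dog. The dog ran."
    total, unique, common = word_stats(sample, top=3)
    print(total, unique, common)
```

Feeding it the concatenated text of every post would give the total, the unique-word count and the top-ten list in one pass.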

That tells us more about English than me. Let’s try a longer phrase.

My most popular four-word phrases:

  1. I am going to
  2. I don’t want to
  3. the rest of the
  4. the size of the
  5. I am not sure

It seems I promise a lot (“I am going to”), I whinge a lot (“I don’t want to”) and I don’t know much (“I am not sure”). Sounds like a good characterisation to me!
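Counting phrases is the same idea with a sliding window: take every run of n consecutive words and tally the runs. The sketch below is one way to do that, again under assumptions of my choosing rather than a record of the actual script. `ngram_counts` is a hypothetical name, the window slides straight across sentence boundaries, and digits are discarded along with punctuation.

```python
import re
from collections import Counter


def ngram_counts(text, n):
    """Count every run of n consecutive words in text.

    Assumptions: words are lowercased, digits and punctuation are dropped,
    and the window ignores sentence boundaries.
    """
    words = re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower())
    # zip over n staggered copies of the word list to form the windows.
    windows = zip(*(words[i:] for i in range(n)))
    return Counter(" ".join(window) for window in windows)


if __name__ == "__main__":
    sample = "I am going to try. I am going to fail. I am not sure."
    for phrase, count in ngram_counts(sample, 4).most_common(3):
        print(count, phrase)
```

Because the windows overlap, a single repeated line of text produces several high-scoring phrases at once, each one shifted along by a word.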

This experiment looks like it is working! Let’s go to phrases of five words.

My most popular five-word phrases:

  1. request timed out request timed
  2. timed out request timed out
  3. out request timed out request
  4. round trip time to ms
  5. time to ms round trip
  6. trip time to ms round
  7. ms round trip time to
  8. to ms round trip time
  9. series of nostalgic reminiscences about
  10. of nostalgic reminiscences about the

Unfortunately, after four words, the experiment breaks down.

The first eight lines above represent the output of the ping application, which has lots of repeated phrases. The last two are a repeated refrain I used in the introduction to a series of articles about the Rational 1000, in order to link them together.

The ping output and Rational 1000 stock phrase dominate up to 10-word phrases, which is where I stopped the experiment.

So, I guess my New Year’s Resolution is to try to cut down on my repetition of the words “Request timed out! Request timed out!” in order to make my conversation more interesting.


Comment

  1. In a hotel in Serbia, there is a sign on all TV-sets that says: “If set breaks, inform manager. Do not interfere with yourself”

    Julian: I (we?) like you as you are. Do not interfere with yourself!
