R Function Usage Frequencies, Take 2

Yesterday, Hadley Wickham commented on my post about how often various R functions are called, suggesting that it would be helpful to know the number of packages that call a function in addition to the number of times that function is called. I compiled the relevant data last night: you can grab it here. This data set includes a row for every function I found, indexed by each of the packages and files in which it was used. At this higher level of resolution, I record the number of times each function was called.
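As a rough illustration of how a data set shaped like this can be collapsed into the two measures discussed here, the sketch below builds a tiny made-up table and aggregates it with base R. The column names (Package, File, Function, Calls), the package names, and all the numbers are assumptions for the sake of the example, not the contents of the real file.

```r
# Hypothetical toy data in the shape described above: one row per
# (package, file, function) combination, with a call count.
# Column names and values are assumptions, not taken from the real file.
usage <- data.frame(
  Package  = c("pkgA", "pkgA", "pkgA", "pkgB", "pkgB", "pkgC"),
  File     = c("a1.R", "a2.R", "a1.R", "b1.R", "b1.R", "c1.R"),
  Function = c("paste", "paste", "sum", "paste", "sum", "paste"),
  Calls    = c(3, 2, 1, 4, 5, 1),
  stringsAsFactors = FALSE
)

# Total number of calls per function, summed over all packages and files.
total_calls <- tapply(usage$Calls, usage$Function, sum)

# Number of distinct packages that use each function at least once.
packages_using <- tapply(usage$Package, usage$Function,
                         function(p) length(unique(p)))

summary_table <- data.frame(
  Function      = names(total_calls),
  TotalCalls    = as.vector(total_calls),
  PackagesUsing = as.vector(packages_using[names(total_calls)]),
  stringsAsFactors = FALSE
)

# Rank by the number of packages, most widely used first.
summary_table <- summary_table[order(-summary_table$PackagesUsing), ]
print(summary_table)
```

Counting distinct packages rather than rows matters here, because a function used in several files of the same package should still count as one package.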

To get a sense of the correspondence between these measures, below I’ve plotted the number of packages using each function in my data set against the log number of times each function is called:

[Figure: package_function_frequencies.png]
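For anyone who wants to draw a similar plot from their own summary, here is a minimal base R sketch. The data frame, its column names, and its values are invented for illustration, and putting the log call count on the x-axis is my guess at a sensible layout rather than a description of how the figure above was made.

```r
# Hypothetical per-function summary; names and values are made up.
fn_summary <- data.frame(
  Function      = c("paste", "sum", "attr"),
  TotalCalls    = c(10, 6, 1),
  PackagesUsing = c(3, 2, 1),
  stringsAsFactors = FALSE
)

# Natural log of the total number of calls to each function.
fn_summary$LogCalls <- log(fn_summary$TotalCalls)

# Scatterplot: packages using each function against log call count.
plot(fn_summary$LogCalls, fn_summary$PackagesUsing,
     xlab = "Log number of calls",
     ylab = "Number of packages using function")
```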

And here’s a new top 25 table, ranked by the number of packages using each function:

Function    Packages Using Function
function    1903
if          1846
c           1795
length      1791
list        1679
for         1656
return      1559
stop        1538
paste       1526
rep         1512
matrix      1419
is.null     1413
sum         1396
max         1368
cat         1308
names       1297
is.na       1241
min         1216
cbind       1175
nrow        1158
sqrt        1157
t           1134
print       1120
class       1120
seq         1098

3 responses to “R Function Usage Frequencies, Take 2”

  1. Tony Breyal

    Very interesting. Did you use an R package to collect all this data, or is it a custom script or maybe another program? Looking at your first post, the only function I’ve never actually used is ?attr, but given its popularity I’m starting to think I should learn exactly what it is.

  2. Dario Solari

    Very interesting!
    On the day you published this post I was thinking about a text mining approach (with the “tm” package) to perform a classification of R coding examples. This may be useful as a search engine criterion in a website that contains R examples. Better than tagging, I suppose.
    (We are developing a RUG website for Italy.)

    What do you think?