Do X in Y

Do X in Y: Convert values to subject-relative z-scores in R

I set out to do something that seemed like it shouldn’t be too hard in R. I had a dataframe with RTs for a bunch of subjects, and I wanted to convert the RTs to z-scores relative to each subject’s own mean. To do this relative to the global mean is super easy:

data$global_zRT <- scale(data$RT)

However, getting it scaled by subject mean (which could be useful for visual inspection of data, or for some analyses) turns out not to be trivial, and I was unable to find relevant posts via Google search. Before posting to Stack Overflow myself, I tried our lab Slack channel. Dave Saltzman and Anne Marie Crinnion quickly produced a solution with dplyr. However, my dplyr calls were getting blocked (masked) by plyr. Anne Marie pointed out how to make the command bulletproof: prefix each function with ‘dplyr::’ so that R is forced to use the dplyr version. Note that ‘subject’ here is a column in the dataframe, not a keyword of some sort.

data <- data %>% dplyr::group_by(subject) %>% dplyr::mutate(zRT = scale(RT))
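One thing to watch: scale() returns a one-column matrix rather than a plain vector, so the new column can end up matrix-valued. Here is a minimal sketch of a base-R alternative, assuming made-up RT values and the same ‘subject’ and ‘RT’ column names as above. ave() applies a function within each subject and returns an ordinary numeric vector in the original row order.

```r
# Toy data: two subjects with different RT baselines (values are made up).
data <- data.frame(
  subject = rep(c("s1", "s2"), each = 4),
  RT      = c(400, 450, 500, 550, 700, 800, 900, 1000)
)

# z-score each RT relative to its own subject's mean and sd.
data$zRT <- ave(data$RT, data$subject,
                FUN = function(x) (x - mean(x)) / sd(x))

# Sanity check: within each subject, the z-scores should have mean 0 and sd 1.
subject_means <- tapply(data$zRT, data$subject, mean)
subject_sds   <- tapply(data$zRT, data$subject, sd)
```

In the dplyr version, wrapping the call as as.numeric(scale(RT)) should give the same plain-vector result.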

— Jim Magnuson

Do X in Y: Install lens in Ubuntu linux

Doug Rohde’s lens (the light, efficient network simulator) is an awesome tool. However, given that it has not been actively maintained since 2000, its shelf-life is probably limited. I still have some legacy projects that were developed in lens (mainly using SRNs) and like to be able to re-run and tweak them. At some point, I’ll move them to TensorFlow, but in the meantime, if I can get them running on Linux, that would be great.

My primary Linux box is an Ubuntu 18.04.1 LTS virtual machine running under VirtualBox on a Mac. I got lens running there by consulting this page. My notes are a bit more compact than the details at that page, and they add crucial details now that it is hard to find legacy packages for tcl/tk.

  1. Get tcl and tk packages:
  2. Install those guys, following instructions like these, to wit:
    • sudo apt install ./name.deb
    • replace ‘name.deb’ with a package name; I assume you should install tcl8.3 first, followed by tcl8.3-dev, tk8.3, and tk8.3-dev
  3. Choose where you will install lens. Personally, I like easy access to it right off my home directory in a folder called LENS.
  4. Download the code to that directory and unpack it:
    • sudo wget http://tedlab.mit.edu/~dr/Lens/Dist/lens.tar.gz
    • sudo tar zxf lens.tar.gz
    • sudo rm lens.tar.gz
  5. Replace every instance of CLK_TCK with CLOCKS_PER_SEC in the files in Src; a one-line way of doing this from this page:
    • sed -i 's%CLK_TCK%CLOCKS_PER_SEC%g' ./Src/command.c ./TclTk/tcl8.3.4/unix/tclUnixPort.h
  6. In Src/system.h, comment out the “#include <bits/nan.h>” line; those functions have been integrated into math.h, which is also included. Skipping this step leads to errors at compile time.
  7. Edit the Makefile. Minimally, replace the line “CFLAGS = -Wall -O4 -march=i486” with “CFLAGS = -Wall -O4”. I also had a weird problem where the build generated a HOSTTYPE directory for i586 that would not work; it went away when I simply commented out every HOSTTYPE section except the default one. Inelegant, but it worked.
  8. Then build it: sudo make all 
    • If it didn’t work, I’m sorry. That’s all I’ve got…
  9. Then deviate slightly from the installation directions. In your ~/.bashrc, add these lines (then save it and either start a new terminal or run ‘source ~/.bashrc’):
    • export LENSDIR=${HOME}/LENS # or whatever your location is
    • export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${LENSDIR}/Bin
    • export PATH=${PATH}:${LD_LIBRARY_PATH}:${LENSDIR}/Bin
  10. You should now be able to execute lens from anywhere by typing ‘lens’.

Do X in Y: Use a function to save many plots to a list in R

You know that miserable feeling when you realize you are copying and pasting snippets of code and modifying them with hard-coded variables? I usually ignore it and press on, but last night, I took the time to convert my code to a function. It was surprisingly easy (it took maybe 5 minutes), and had unexpected benefits. For example, the task I was doing required creating many (e.g., 8-30) ggplot objects, pushing them into a list, and then using multiplot to create PDFs. Every time I ran a chunk of code for one set of graphs, I found myself fidgeting while RStudio drew each of the graphs in the plot window. My efforts to suppress that unwanted plotting were for naught, but when I converted the code to a function, all that extra plotting went away. Creating a 16-panel plot is probably 10x faster using the function! Here’s the code for the function (apologies to any programmers whose sensibilities I offend; I’m a hack, not a hacker).


#####################################################################################
# 2018.04.30, Jim Magnuson
library(foreach)
library(ggplot2)
library(scales)

plot.to.list <- function(dat, x.vars, x.names, y.vars, y.names, textsize=12,
                         jitteramount=0.1, ...) {
  someplots <- list()
  at = 0
  foreach(xvar=x.vars, xname=x.names) %do% {
    foreach(yvar=y.vars, yname=y.names) %do% {
      # First, get correlation between the current pair
      # NB: the 'get' command is crucial for evaluating the
      # strings as variable names, but this doesn't work in
      # the ggplot code below
      acor = sprintf("%.3f", with(dat, cor(get(xvar), get(yvar))))
      at = at + 1 # increment list position
      # now add a ggplot object to the list
      # NB: instead of 'get', we use 'aes_string' in place of 'aes'
      someplots[[at]] <- ggplot(dat, aes_string(x=xvar, y=yvar)) +
        geom_jitter(position=position_jitter(jitteramount)) +
        geom_smooth(method='lm', se=FALSE, linetype="dashed") +
        scale_x_continuous(breaks=pretty_breaks()) + # from scales, try it, you'll like it
        scale_y_continuous(breaks=pretty_breaks()) +
        theme(plot.title = element_text(hjust = 1)) + # right justify title
        theme(panel.background = element_rect(colour="black", fill="white"),
              axis.text.x = element_text(size=textsize, face="plain", colour="black"),
              axis.text.y = element_text(size=textsize, face="plain", colour="black"),
              axis.title.x = element_text(size=textsize, face="bold", colour="black", vjust=-.4),
              plot.title = element_text(size=11),
              axis.title.y = element_text(size=textsize, face="bold", colour="black", vjust=1)) +
        xlab(xname) + ylab(yname) +
        ggtitle(paste("r =", acor)) # plot r as title
    }
  }
  return(someplots)
}


####################################################################################

 

yvars=c("RT","RT_lenC") # variables w/in trace.sub I want to be on y axes
ynames=c("RT", "Adjusted RT") # Better labels than the variable names
xvars=c("NB", "DEL", "ADD", "SUB") # more trace.sub vars that I want on x axes
xnames=c("Neighbors", "Deletions", "Additions", "Substitutions") # Better labels
# now call the function:
nb.plots = plot.to.list(dat=trace.sub, x.vars=xvars, y.vars=yvars,
x.names=xnames, y.names=ynames)

# now create a PDF with all the plots
pdf("trace_neighbor_types.pdf",height=7,width=13.7)
# get multiplot function here:
#      http://www.cookbook-r.com/Graphs/Multiple_graphs_on_one_page_(ggplot2)/
multiplot(plotlist = nb.plots, cols=4)
dev.off()
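
Newer ggplot2 releases deprecate aes_string(), so here is a minimal sketch of the same plots-to-a-list idea using the .data pronoun instead. The helper names one_plot and plot_pairs are hypothetical, the built-in mtcars data stands in for trace.sub, and the styling is stripped down to a scatterplot with r as the title. Building each plot inside its own function call is deliberate: aes() looks up xvar and yvar only when the plot is drawn, so each plot needs its own copy of those values.

```r
library(ggplot2)

# Hypothetical helper: builds one scatterplot. Each call has its own
# environment, so xvar and yvar still hold the right values at draw time.
one_plot <- function(dat, xvar, yvar) {
  r <- cor(dat[[xvar]], dat[[yvar]])   # dat[[name]] plays the role of get(name)
  ggplot(dat, aes(x = .data[[xvar]], y = .data[[yvar]])) +
    geom_point() +
    ggtitle(sprintf("r = %.3f", r))    # plot r as title
}

# Hypothetical wrapper: one plot per x/y pair, with y varying fastest
# within each x (the same order as the nested loops above).
plot_pairs <- function(dat, x.vars, y.vars) {
  pairs <- expand.grid(y = y.vars, x = x.vars, stringsAsFactors = FALSE)
  unname(Map(function(x, y) one_plot(dat, x, y), pairs$x, pairs$y))
}

# Assumption: mtcars columns are used purely for illustration.
demo.plots <- plot_pairs(mtcars, x.vars = c("wt", "hp"), y.vars = c("mpg", "qsec"))
```

The resulting list can be handed to multiplot(plotlist = demo.plots, cols = 2) in the same way as nb.plots.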

Do X in Y: Randomize duration and F0 of speech files in Praat

Summary: I describe a script that makes random modifications to duration and pitch. It can serve as an example for how to read in all .wav files in a directory and modify them from the command line.

For a machine learning project, we wanted to induce variability in a set of speech files without recording more tokens of our critical words. We thought, “let’s just use a Praat script to randomly jigger duration and F0 — that can’t be hard, can it?” Well, it is harder than I thought given that I don’t really grok Praat scripting — it is the most foreign and strange programming language I’ve ever encountered.

After consulting with various colleagues, I remembered that Sergey Kornilov had used a script to compress lots of files for a project. He sent me his script. I found a script by Shigeto Kawahara of Keio U. in Japan that modified pitch. Putting them together, I now have a script that is called from the command line and reads in all .wav files in a directory (the directory where the script is located), randomly compresses/expands the file according to a hard-coded range (i.e., you have to change the values in the file), randomly adjusts pitch (by multiplying by a scalar), and writes the resulting file in a folder called output that must exist in the starting directory. The file names include tags indicating how duration and pitch were modified. Here’s the script. I hope someone else finds it helpful! You can copy the code and paste it into a text document that you might call something like vary_duration_and_pitch.praat.

— jim magnuson

# 2017.11.17, Jim Magnuson, based on a compression script by
# Sergey Kornilov and a pitch modification script by
# Shigeto KAWAHARA of Keio U.
# (http://user.keio.ac.jp/~kawahara/scripts/changeF0_e.praat)

# kludgy script to shift pitch by random factor between
# minF0 and maxF0 and compress/expand by random factor between
# minComp and maxComp.

# It does this for every .wav file in the directory where the
# script is located, and writes resulting files in ./output
# (so you must create a folder called output; note
# that the script will overwrite files in that directory
# without warning!).

# The output files have _dur_X and _f0_Y inserted in their
# names. For example, _dur_87 means compression to factor of
# 0.87, _dur_108 means compression (expansion) to factor of
# 1.08, etc.

# set locations
inputDir$ = "./"
outputDir$ = "./output/"

# set ranges for pitch and compression
minF0 = 0.5
maxF0 = 2.0
minComp = 0.5
maxComp = 2.0

# used to incorporate duration and F0 changes
# in output file name

resolution = 100

# read *.wav filenames into strings
strings = Create Strings as file list: "list", inputDir$ + "*.wav"
numberOfFiles = Get number of strings

# now loop through .wav list
for ifile to numberOfFiles

# open file in position ifile in string list
filename$ = Get string... ifile

# give a little info on the console re: progress
appendInfoLine: "Working on " + inputDir$ + filename$

# read the actual file
Read from file: inputDir$ + filename$

# set random duration factor
duration_scalar = randomUniform(minComp, maxComp)

# LENGTHEN AND RESYNTHESIZE IN ONE STEP USING PSOLA
# 75 and 600 are standard parameters for the
# frequency range (Hz) used in periodicity analysis;
# the last argument is the compression factor

Lengthen (PSOLA)... 75 600 duration_scalar

# RANDOMIZING F0
# this is going to require Manipulation commands,
# so we need to select the sound; let's name it
this_sound$ = selected$ ("Sound")

# grab it + give params for onset of analysis window and
# frequency range used for periodicity analysis
To Manipulation... 0.01 75 600

# pop pitch tier into memory (apparently)
Extract pitch tier

# set random pitch change factor
pitch_scalar = randomUniform(minF0, maxF0)

# modify pitch of the pitch tier in memory
Formula... self * pitch_scalar;

# reselect the sound and replace pitch tier
select Manipulation 'this_sound$'
plus PitchTier 'this_sound$'
Replace pitch tier

# reselect the sound??? and resynthesize
select Manipulation 'this_sound$'
Get resynthesis (PSOLA)

# SAVE RESULTING FILES
# create rounded versions of the pitch and
# duration factors for use in filename

pitch_scalar_rounded = round(pitch_scalar * resolution)
duration_scalar_rounded = round(duration_scalar * resolution)
filenameDur$ = filename$ - ".wav" + "_dur" + "_" + string$(duration_scalar_rounded) + "_f0" + "_" + string$(pitch_scalar_rounded) + ".wav"

# update user and write to file
appendInfoLine: "* writing to " + outputDir$ + filenameDur$
Write to WAV file: outputDir$ + filenameDur$
select Strings list

endfor

select all
Remove

# end of script