Do X in Y

Do X in Y: Convert values to subject-relative z-scores in R

Posted on February 9, 2021 (updated November 27, 2022) by James Magnuson

I set out to do something that seemed like it shouldn’t be too hard in R. I had a dataframe with RTs for a bunch of subjects, and I wanted to convert the RTs to z-scores relative to each subject’s own mean. To do this relative to the global mean is super easy:

data$global_zRT <- scale(data$RT)

However, scaling by each subject’s own mean (which could be useful for visual inspection of data, or for some analyses) turns out not to be trivial, and I was unable to find relevant posts via Google search. Before posting to Stack Overflow myself, I tried our lab Slack channel. Dave Saltzman and Anne Marie Crinnion quickly produced a solution with dplyr. However, my dplyr calls were getting masked by plyr functions of the same name. Anne Marie pointed out how to make the command bulletproof: prefix each function with dplyr:: so that R uses the dplyr version no matter what else is loaded. Note that ‘subject’ here is a column in the dataframe, not a keyword of some sort.

data <- data %>% dplyr::group_by(subject) %>% dplyr::mutate(zRT = scale(RT))

— Jim Magnuson

Do X in Y: Install lens in Ubuntu linux

Posted on October 25, 2018 by James Magnuson

Doug Rohde’s lens (the light, efficient network simulator) is an awesome tool. However, given that it has not been actively maintained since 2000, its shelf-life is probably limited. I still have some legacy projects that were developed in lens (mainly using SRNs) and like to be able to re-run and tweak them. At some point, I’ll move them to TensorFlow, but in the meantime, if I can get them running on Linux, that would be great.

My primary Linux box is an Ubuntu 18.04.1 LTS virtual machine running under VirtualBox on a Mac. I got lens running here by consulting this page. My notes are a bit more compact than the details at that page, and they add some crucial details now that legacy packages for tcl/tk are hard to find.

Get tcl and tk packages:
- tcl: http://old-releases.ubuntu.com/ubuntu/pool/universe/t/tcl8.3/
  - Get both tcl8.3_8.3.5-14_amd64.deb and the corresponding tcl8.3-dev package.
- tk: http://old-releases.ubuntu.com/ubuntu/pool/universe/t/tk8.3/
  - Get matching tk8.3 and tk8.3-dev packages
Install those guys, following instructions like these, to wit:
- sudo apt install ./name.deb
- replace ‘name.deb’ with a package name; I assume you should install tcl8.3 first, followed by tcl8.3-dev, tk8.3, and tk8.3-dev
Choose where you will install lens. Personally, I like easy access to it right off my home directory in a folder called LENS.
Download the code to that directory and unpack it:
- sudo wget http://tedlab.mit.edu/~dr/Lens/Dist/lens.tar.gz
- sudo tar zxf lens.tar.gz
- sudo rm lens.tar.gz
Replace every instance of CLK_TCK with CLOCKS_PER_SEC in the source (Src/command.c and the bundled Tcl header); a one-line way of doing this, from this page:
- sed -i 's%CLK_TCK%CLOCKS_PER_SEC%g' ./Src/command.c ./TclTk/tcl8.3.4/unix/tclUnixPort.h
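The sed call above only touches the two files it names. A minimal sketch of a way to check that nothing was missed: search the whole tree for leftover CLK_TCK and rewrite whatever turns up. The scratch directory and its file contents below are hypothetical stand-ins for the unpacked lens tree, so this can be tried without touching the real source; it assumes GNU grep, sed, and xargs, as on Ubuntu.

```shell
# Scratch stand-in for the unpacked lens tree (hypothetical paths and contents).
demo=$(mktemp -d)
mkdir -p "$demo/Src"
printf 'double t = (double) clock() / CLK_TCK;\n' > "$demo/Src/command.c"
printf 'int unrelated = 1;\n' > "$demo/Src/other.c"

# List every file that still mentions CLK_TCK, then rewrite only those files.
# xargs -r skips the sed call entirely when grep finds nothing.
grep -rl 'CLK_TCK' "$demo" | xargs -r sed -i 's%CLK_TCK%CLOCKS_PER_SEC%g'

# Count the files that still contain the old name.
remaining=$(grep -rl 'CLK_TCK' "$demo" | wc -l | tr -d ' ')
echo "files still containing CLK_TCK: $remaining"
```

In the real tree, the same two commands would be run from the lens directory with "." in place of "$demo".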
In Src/system.h, comment out the “include <bits/nan.h>” line; what it provided has been integrated into math.h, which is also included. Leaving the line in causes errors at compile time.
Edit the Makefile. Minimally, replace the line “CFLAGS = -Wall -O4 -march=i486” with “CFLAGS = -Wall -O4”. I also had a weird problem where the build generated a HOSTTYPE directory for i586 that would not work; it went away when I simply commented out every HOSTTYPE section except the default one. Inelegant, but it worked.
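If you would rather script the two hand edits above than open an editor, something like the following might do it. This is a sketch, not what I actually ran: the scratch files are hypothetical stand-ins for Src/system.h and the Makefile, and it assumes the include line is spelled exactly "#include <bits/nan.h>" at the start of a line. The HOSTTYPE sections are left for manual editing.

```shell
# Scratch copies standing in for Src/system.h and the Makefile (hypothetical contents).
demo=$(mktemp -d)
printf '#include <math.h>\n#include <bits/nan.h>\n' > "$demo/system.h"
printf 'CC = gcc\nCFLAGS = -Wall -O4 -march=i486\n' > "$demo/Makefile"

# Wrap the bits/nan.h include in a C comment; '&' re-inserts the matched text.
sed -i 's%^#include <bits/nan.h>%/* & */%' "$demo/system.h"

# Drop the i486 flag, but only on the CFLAGS line.
sed -i '/^CFLAGS/s/ -march=i486//' "$demo/Makefile"

# Show the result for inspection.
cat "$demo/system.h" "$demo/Makefile"
```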
Then build it: sudo make all
- If it didn’t work, I’m sorry. That’s all I’ve got…
Then deviate slightly from the installation directions. In your ~/.bashrc, add these lines (and save it and then start a new terminal or ‘source ~/.bashrc’):
- export LENSDIR=${HOME}/LENS # or whatever your location is
- export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${LENSDIR}/Bin
- export PATH=${PATH}:${LENSDIR}/Bin
You should now be able to execute lens anywhere by typing ‘lens’.
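A quick way to confirm the environment is set up as intended is to ask the shell where it finds lens. The sketch below uses a hypothetical stub executable in a scratch LENSDIR so it can be run anywhere; with a real build you would skip the stub and keep only the exports and the check. It also guards against one small wrinkle: if LD_LIBRARY_PATH starts out empty, a plain ${LD_LIBRARY_PATH}:... leaves a leading colon, and an empty entry is treated as the current directory.

```shell
# Hypothetical stand-in for a built lens tree: a stub 'lens' executable in Bin.
LENSDIR=$(mktemp -d)
mkdir -p "$LENSDIR/Bin"
printf '#!/bin/sh\necho "lens stub"\n' > "$LENSDIR/Bin/lens"
chmod +x "$LENSDIR/Bin/lens"
export LENSDIR

# Append Bin to LD_LIBRARY_PATH; the :+ form adds the separating colon
# only when the variable already had a value.
export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}${LENSDIR}/Bin"
export PATH="${PATH}:${LENSDIR}/Bin"

# Ask the shell which 'lens' it would run.
found=$(command -v lens)
echo "lens resolves to: $found"
```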

Do X in Y: Use a function to save many plots to a list in R

Posted on May 1, 2018 (updated May 12, 2018) by James Magnuson

You know that miserable feeling when you realize you are copying and pasting snippets of code and modifying them with hard-coded variables? I usually ignore it and press on, but last night, I took the time to convert my code to a function. It was surprisingly easy (it took maybe 5 minutes), and had unexpected benefits. For example, the task I was doing required creating many (e.g., 8-30) ggplot objects, pushing them into a list, and then using multiplot to create PDFs. Every time I ran a chunk of code for one set of graphs, I found myself fidgeting while RStudio drew each of the graphs in the plot window. My efforts to suppress that unwanted plotting were for naught, but when I converted to a function, all that extra plotting went away. Creating a 16-panel plot is probably 10x faster using the function! Here’s the code for the function (apologies to any programmers whose sensibilities I offend; I’m a hack, not a hacker).

#####################################################################################
# 2018.04.30, Jim Magnuson
library(foreach)
library(ggplot2)
library(scales)

plot.to.list <- function(dat, x.vars, x.names, y.vars, y.names, textsize=12,
                         jitteramount=0.1, ...) {
  someplots <- list()
  at = 0
  foreach(xvar=x.vars, xname=x.names) %do% {
    foreach(yvar=y.vars, yname=y.names) %do% {
      # First, get correlation between the current pair
      # NB: the 'get' command is crucial for evaluating the
      # strings as variable names, but this doesn't work in
      # the ggplot code below
      acor = sprintf("%.3f", with(dat, cor(get(xvar), get(yvar))))
      at = at + 1 # increment list position
      # now add a ggplot object to the list
      # NB: instead of 'get', we use 'aes_string' in place of 'aes'
      someplots[[at]] <- ggplot(dat, aes_string(x=xvar, y=yvar)) +
        geom_jitter(position=position_jitter(jitteramount)) +
        geom_smooth(method='lm', se=FALSE, linetype="dashed") +
        scale_x_continuous(breaks=pretty_breaks()) + # from scales, try it, you'll like it
        scale_y_continuous(breaks=pretty_breaks()) +
        theme(plot.title = element_text(hjust = 1)) + # right justify title
        theme(panel.background = element_rect(colour="black", fill="white"),
              axis.text.x = element_text(size=textsize, face="plain", colour="black"),
              axis.text.y = element_text(size=textsize, face="plain", colour="black"),
              axis.title.x = element_text(size=textsize, face="bold", colour="black", vjust=-.4),
              plot.title = element_text(size=11),
              axis.title.y = element_text(size=textsize, face="bold", colour="black", vjust=1)) +
        xlab(xname) + ylab(yname) +
        ggtitle(paste("r =", acor)) # plot r as title
    }
  }
  return(someplots)
}
####################################################################################

yvars=c("RT","RT_lenC") # variables w/in trace.sub I want to be on y axes
ynames=c("RT", "Adjusted RT") # Better labels than the variable names
xvars=c("NB", "DEL", "ADD", "SUB") # more trace.sub vars that I want on x axes
xnames=c("Neighbors", "Deletions", "Additions", "Substitutions") # Better labels

# now call the function:
nb.plots = plot.to.list(dat=trace.sub, x.vars=xvars, y.vars=yvars, x.names=xnames, y.names=ynames)

# now create a PDF with all the plots
pdf("trace_neighbor_types.pdf",height=7,width=13.7)
# get multiplot function here:
# http://www.cookbook-r.com/Graphs/Multiple_graphs_on_one_page_(ggplot2)/
multiplot(plotlist = nb.plots, cols=4)
dev.off()

Do X in Y: Randomize duration and F0 of speech files in Praat

Posted on November 20, 2017 (updated May 2, 2018) by James Magnuson

Summary: I describe a script that makes random modifications to duration and pitch. It can serve as an example for how to read in all .wav files in a directory and modify them from the command line.

For a machine learning project, we wanted to induce variability in a set of speech files without recording more tokens of our critical words. We thought, “let’s just use a Praat script to randomly jigger duration and F0 — that can’t be hard, can it?” Well, it is harder than I thought given that I don’t really grok Praat scripting — it is the most foreign and strange programming language I’ve ever encountered.

After consulting with various colleagues, I remembered that Sergey Kornilov had used a script to compress lots of files for a project. He sent me his script. I found a script by Shigeto Kawahara of Keio U. in Japan that modified pitch. Putting them together, I now have a script that is called from the command line and reads in all .wav files in a directory (the directory where the script is located), randomly compresses/expands the file according to a hard-coded range (i.e., you have to change the values in the file), randomly adjusts pitch (by multiplying by a scalar), and writes the resulting file in a folder called output that must exist in the starting directory. The file names include tags indicating how duration and pitch were modified. Here’s the script. I hope someone else finds it helpful! You can copy the code and paste it into a text document that you might call something like vary_duration_and_pitch.praat.

— jim magnuson

# 2017.11.17, Jim Magnuson, based on a compression script by
# Sergey Kornilov and a pitch modification script by
# Shigeto KAWAHARA of Keio U.
# (http://user.keio.ac.jp/~kawahara/scripts/changeF0_e.praat)

# kludgy script to shift pitch by random factor between
# minF0 and maxF0 and compress/expand by random factor between
# minComp and maxComp.
# It does this for every .wav file in the directory where the
# script is located, and writes resulting files in ./output
# (so you must create a folder called output; note
# that the script will overwrite files in that directory
# without warning!).
# The output files have _dur_X and _f0_Y inserted in their
# names. For example, _dur_87 means compression to factor of
# 0.87, _dur_108 means compression (expansion) to factor of
# 1.08, etc.

# set locations
inputDir$ = "./"
outputDir$ = "./output/"

# set ranges for pitch and compression
minF0 = 0.5
maxF0 = 2.0
minComp = 0.5
maxComp = 2.0

# used to incorporate duration and F0 changes
# in output file name
resolution = 100

# read *.wav filenames into strings
strings = Create Strings as file list: "list", inputDir$ + "/*.wav"
numberOfFiles = Get number of strings

# now loop through .wav list
for ifile to numberOfFiles

# open file in position ifile in string list
filename$ = Get string... ifile

# give a little info on the console re: progress
appendInfoLine: "Working on " + inputDir$ + filename$

# read the actual file
Read from file: inputDir$ + filename$

# set random duration factor
duration_scalar = randomUniform(minComp, maxComp)

# LENGTHEN AND RESYNTHESIZE IN ONE STEP USING PSOLA
# 75 and 600 are standard parameters for the
# frequency range (Hz) used in periodicity analysis;
# the last argument is the compression factor
Lengthen (PSOLA)... 75 600 duration_scalar

# RANDOMIZING F0
# this is going to require Manipulation commands,
# so we need to select the sound; let's name it
this_sound$ = selected$ ("Sound")

# grab it + give params for onset of analysis window and
# frequency range used for periodicity analysis
To Manipulation... 0.01 75 600

# pop pitch tier into memory (apparently)
Extract pitch tier

# set random pitch change factor
pitch_scalar = randomUniform(minF0, maxF0)

# modify pitch of the pitch tier in memory
Formula... self * pitch_scalar;

# reselect the sound and replace pitch tier
select Manipulation 'this_sound$'
plus PitchTier 'this_sound$'
Replace pitch tier

# reselect the sound??? and resynthesize
select Manipulation 'this_sound$'
Get resynthesis (PSOLA)

# SAVE RESULTING FILES
# create rounded versions of the pitch and
# duration factors for use in filename
pitch_scalar_rounded = round(pitch_scalar * resolution)
duration_scalar_rounded = round(duration_scalar * resolution)
filenameDur$ = filename$ - ".wav" + "_dur" + "_" + string$(duration_scalar_rounded) + "_f0" + "_" + string$(pitch_scalar_rounded) + ".wav"

# update user and write to file
appendInfoLine: "* writing to " + outputDir$ + filenameDur$
Write to WAV file: outputDir$ + filenameDur$
select Strings list

endfor

select all
Remove

# end of script
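
For completeness, here is a minimal sketch of how the script might be run from a shell. It assumes the Praat binary is on your PATH under the name praat (on a Mac it lives inside the application bundle instead) and that you saved the script as vary_duration_and_pitch.praat in the same directory as the .wav files; --run is Praat's command-line option for running a script without opening the GUI. The check around the call is my addition so the snippet fails gently when either piece is missing.

```shell
# Run this from the directory that holds the .wav files and the script.
script=vary_duration_and_pitch.praat   # file name suggested above
mkdir -p output                        # the script writes to ./output and expects it to exist

# Only call Praat when both the binary and the script are actually present.
if command -v praat >/dev/null 2>&1 && [ -f "$script" ]; then
    praat --run "$script"
else
    echo "praat or $script not found; nothing was run"
fi
```

Remember that the script overwrites same-named files in output without warning, so point it at an empty folder if you care about earlier results.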