I’m working on a project to (pop) update a field in PeopleSoft based on a file from an outside source. The goofy part is that the file does not have a student ID (not surprising), and the first 5 characters of the SSN are masked (more surprising). So then how do I do a one-to-one match in order to do the update? What I’ve come up with had a couple of Excel and PS Query tips that I thought were worth sharing. The first thing I did was create some new columns in the Excel file that is the source data. To get just the 4 “good” characters of the SSN, I used the RIGHT function.
Equation: =RIGHT(B4,4)
Translation: Give me the last 4 characters from the string (text) in cell B4
Example: ***-**-1234 = 1234
The name in the source data is listed in this format, Last (comma) (space) first (space) middle initial. To get just the last name from that string I first used the FIND function to get which character position the “,” is in.
Equation: =FIND(“,”,C4,1)
Translation: This looks weird because it is a comma that I’m looking for. The syntax is FIND then what you are looking for in quotes, what cell to look in, and what character position to start looking from. So this equation says find the comma in cell C4 and start looking at the first character.
Example: Student, John A = 8 (the comma is the 8th character in the string (name))
Then I use the LEFT function to get the last name.
Equation: =LEFT(C4,G4-1)
Translation: In this example G4 has the results of the Find function. So this says give me the beginning (Left) characters of the string in cell C4 and stop one character before the number G4 (G4 -1).
Example: G4 = 8 (see above) Student, John A = Student (left 7 (8-1) characters in the name)
Next I wanted to use the first three characters in the first name. I only wanted the first three because of the variability of how first names are recorded. I used the MID function for this.
Equation: =MID(C4,G4+2,3)
Translation: MID is like LEFT and RIGHT except it lets you specify a starting point and an ending point. This equations says: starting two characters after the comma (remember G4 holds the position of the comma and a space follows the comma, G4+2) give me the next three characters.
Example: G4 = 8 Student, John A = Joh (three characters starting after the comma and the space)
So, those are fields that I am going to match on in PeopleSoft; the last four of the SSN, the last name, and the first three letters of the first name.
We (thankfully) have a process that allows us to upload a file to PeopleSoft to use in a query. I uploaded this input file and used some criteria with expressions to do the matching from the file to the PEOPLE_SRCH record to get the student ID. The last name field is a simple EQUAL TO criteria then I compare the last four of the SSN in the input file to the expression SUBSTR(B.NATIONAL_ID,-4). The function SUBSTR typically works like the LEFT function in Excel where you specify a field and a number to get the first given number of characters in that text. However (HOT TIP) if you use a negative number, as I did, it will work like the RIGHT function in Excel and give you the last given number of characters in that text.
For the first name comparison I used the expression SUBSTR(B.FIRST_NAME_SRCH,1,3). When used with two number parameters SUBSTR works like Excel’s MID, so this expression gets the first (because the first number parameter is 1) three characters of the first name.
I double checked the results and using these three criteria I am able to get the student ID for those in the input file. While you may never have this same exact situation, hopefully you’ll find parts of this post useful in other projects.
Cheers,
John