Adding New Schools
While we've provided 1,114 schools for you to start with, thousands more schools have scrapable online bookstores. You can incorporate them by adding them to the database.
The rows you'll need to add for a new school are:
- Bookstore_Type_Id -- This points to the Bookstore_Types table.
- StoreFront_URL -- The URL where students normally search for textbooks.
  - Follett -- Example: Purdue is http://www.bkstr.com/Home/10001-10258-1 . We have to link to the homepage instead of the search form, because Follett doesn't allow direct links.
  - MBS -- Example: http://bookstore.uwm.edu/SelectTermDept.aspx . Follow this format of bookstoresite.com/SelectTermDept.aspx.
  - ePOS -- Example: http://bookstore.edcc.edu/ePOS?store=336&form=shared3%2ftextbooks%2ftext_browse%2ehtml&design=336&campus=MAIN . Just follow the textbook link from the main store page.
  - CampusHub -- Example: http://www.fortyninershops.net/buy_main.asp . Just replace the domain. Sometimes they move this page, though, so double-check that it's the search form.
  - bncollege -- Example: http://iub.bncollege.com/webapp/wcs/stores/servlet/TBWizardView?catalogId=10001&storeId=39052&langId=-1 . Get this URL by clicking the textbook link from the front page.
  - Neebo -- Example: http://www.neebo.com/uta . This is the URL you land on after selecting the bookstore from the dropdown list on neebo.com.
- Fetch_URL -- The root URL that our scraper scrapes from.
  - Follett -- Always use http://www.bkstr.com .
  - MBS -- Example: http://bookstore.uwm.edu/mobile/ . Replace the (sub)domain with that of the store you're scraping. Note that we use the mobile site for scraping.
  - ePOS -- Example: http://bookstore.edcc.edu/ePOS . Just replace the domain.
  - CampusHub -- Example: http://www.fortyninershops.net/buy_courselisting.asp . Replace the domain. If there's a problem, check that they haven't changed the search submit page.
  - bncollege -- Example: http://iub.bncollege.com/ . Replace "iub" with the specific school subdomain.
  - Neebo -- Always http://www.neebo.com/ . Don't forget the slash at the end.
- Store_Value -- A bookstore ID value required on Follett and bncollege systems.
  - Follett -- Example: Purdue is 10258, a value you can extract from the StoreFront_URL above.
  - bncollege -- Get this from the URL of the textbook search. Example: Indiana University is 39052 (see StoreFront_URL above).
  - other systems -- Set it to NULL.
- Multiple_Campuses -- A Yes/No value indicating whether multiple stores are on one site. Only used at bncollege.
  - bncollege -- Some Barnes & Noble colleges have multiple campuses searchable from the same site. An example is UH, where you can see the Campus dropdown: http://uh.bncollege.com/webapp/wcs/stores/servlet/TBWizardView?catalogId=10001&storeId=19067&langId=-1 . For these colleges, set the value to Y.
  - other systems -- Set it to NULL.
- Follett_HEOA_Store_Value -- A special Follett bookstore ID value required to get class_items at Follett systems. See the Scraper Documentation for an explanation of how it's used.
  - Follett -- The Follett_HEOA_Store_Value is the store_id associated with the given bookstore. You can extract it from the image URL of the logo on that bookstore’s site. Example: 303 is the Purdue Follett_HEOA_Store_Value. We know this from viewing the source: <img src="http://images.efollett.com/htmlroot/images/templates/storeLogos/CA/303.gif" id="logo">
  - other systems -- Set it to NULL.
- Bookstore_ID -- Point it to the ID of the bookstore you just added.
- Campus_Value -- A campus ID value required by Follett and bncollege systems.
  - Follett -- A few Follett stores have multiple campuses. Get the Campus_Value by viewing the source and reading the values from the Campus select. Example: At ASU (http://www.bkstr.com/Home/10001-196905-1) the Campus select has this option for Polytechnic: <option value="2345">Polytechnic</option>
  - bncollege -- Get it by viewing the source on the textbooks page. For example, at Indiana University you can find the hidden input with the campusId 31379761: <input title="campusid" type="hidden" name="campusId" value='31379761'/> . Some stores, like UH mentioned previously, have multiple campus values, which you can get from the source; you should have a Campus row for each of them.
  - other systems -- Set it to NULL.
- Program_Value -- Identifier for "Programs" used in Follett's system.
  - Follett -- Example: At Purdue, the Program_Value is 553, which we know from viewing the source and looking up the value in the select: <option value="553">Purdue-West Bookstore</option>
  - other systems -- Set it to NULL.
- Location -- Optional indicator of where the store is located.
- Enabled -- Self-explanatory.
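Several of the ID values in the list above can be read straight out of a URL. The following is a minimal Python sketch of that, not part of the scraper itself. The function names are hypothetical, and the URL patterns are assumptions generalized from the Purdue and Indiana University examples; other stores may lay out their URLs differently.

```python
import re
from urllib.parse import urlparse, parse_qs


def follett_store_value(storefront_url):
    """Return the middle number of a bkstr.com /Home/<a>-<b>-<c> path, or None.

    Assumes the Store_Value always sits in the middle position, as it does
    in the Purdue example.
    """
    match = re.search(r"/Home/\d+-(\d+)-\d+", urlparse(storefront_url).path)
    return match.group(1) if match else None


def bncollege_store_value(storefront_url):
    """Return the storeId query parameter of a bncollege textbook URL, or None."""
    values = parse_qs(urlparse(storefront_url).query).get("storeId")
    return values[0] if values else None


def follett_heoa_store_value(logo_src):
    """Return the numeric file name of a storeLogos image URL, or None.

    Assumes logos follow the .../storeLogos/<folder>/<number>.gif layout
    seen in the Purdue example.
    """
    match = re.search(r"/storeLogos/[^/]+/(\d+)\.gif$", urlparse(logo_src).path)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(follett_store_value("http://www.bkstr.com/Home/10001-10258-1"))
    print(bncollege_store_value(
        "http://iub.bncollege.com/webapp/wcs/stores/servlet/TBWizardView"
        "?catalogId=10001&storeId=39052&langId=-1"
    ))
    print(follett_heoa_store_value(
        "http://images.efollett.com/htmlroot/images/templates/storeLogos/CA/303.gif"
    ))
```

Each function returns None when the URL doesn't match the expected shape, which is a hint to fall back to looking the value up by hand.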
Campuses can go by multiple names, for example "IU" and "Indiana University". Indicate the primary one with Is_Primary. You need a primary Campus_Name for a school to be enabled.
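The Campus_Value and Program_Value lookups in the field list above both come down to viewing the page source and reading option values or a hidden input. As a rough sketch of automating that with Python's standard html.parser module: the class and function names here are hypothetical, and since the source doesn't say what the relevant select elements are called, this collects every option on the page and leaves picking the right one (by its label) to you.

```python
from html.parser import HTMLParser


class ValueCollector(HTMLParser):
    """Collects (value, label) pairs for <option> tags and campusId inputs."""

    def __init__(self):
        super().__init__()
        self.options = []     # (value, label) for every <option> with a value
        self.campus_ids = []  # values of <input name="campusId"> elements
        self._value = None    # value of the <option> currently being read
        self._label = []      # text pieces seen inside that <option>

    def _flush(self):
        # Store the option being read, if any, and reset the reader state.
        if self._value is not None:
            self.options.append((self._value, "".join(self._label).strip()))
        self._value = None
        self._label = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "option":
            self._flush()  # tolerate options without a closing tag
            self._value = attrs.get("value")
        elif tag == "input" and attrs.get("name") == "campusId":
            self.campus_ids.append(attrs.get("value"))

    def handle_data(self, data):
        if self._value is not None:
            self._label.append(data)

    def handle_endtag(self, tag):
        if tag in ("option", "select"):
            self._flush()


def collect_values(html):
    """Parse a page's HTML and return the populated collector."""
    collector = ValueCollector()
    collector.feed(html)
    collector.close()
    return collector


if __name__ == "__main__":
    page = (
        '<select><option value="2345">Polytechnic</option></select>'
        "<input title=\"campusid\" type=\"hidden\" name=\"campusId\" value='31379761'/>"
    )
    found = collect_values(page)
    print(found.options)
    print(found.campus_ids)
```

For a bncollege store with several campuses, you'd expect one entry per campus, each of which needs its own Campus row.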
Adding Follett_HEOA_Term_Value to the Terms_Cache
The Follett_HEOA_Term_Value is the hardest part of adding a (Follett) school to the database. It’s set individually by each school, so there are several ways to discover what it is:
- View the source of the school’s class schedule; much of the time the term values you find there will work
- Use Google. If someone has publicly linked to the booklookServlet before, then its Follett_HEOA_Term_Value will be in that link.
- Try generic/frequent values like “Fall+2011”.
- Many schools use integer values, so you can use wget or curl to iterate through a range like 1-10,000
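The last two guessing strategies in the list above are easy to script. Here is a small Python sketch that generates candidate values; the function names and the choice of seasons are made up for illustration. Values are produced with a plain space ("Fall 2011"), on the assumption that “Fall+2011” is just the URL-encoded form of the same value.

```python
from itertools import product


def seasonal_candidates(years, seasons=("Spring", "Summer", "Fall", "Winter")):
    """Yield values such as 'Fall 2011' for every year/season pair."""
    for year, season in product(years, seasons):
        yield f"{season} {year}"


def integer_candidates(start=1, stop=10_000):
    """Yield '1', '2', ... up to and including stop, as strings."""
    for number in range(start, stop + 1):
        yield str(number)


def all_candidates(years):
    """Yield the few seasonal guesses first, then the long integer range."""
    yield from seasonal_candidates(years)
    yield from integer_candidates()


if __name__ == "__main__":
    for candidate in seasonal_candidates([2011]):
        print(candidate)
```

Trying the seasonal guesses first keeps the number of requests low when a school does use a readable term name.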
We check that a Follett_HEOA_Term_Value is correct by trying it out on the booklook URL. To illustrate, the URL with an incorrect Follett_HEOA_Term_Value at
http://www.bkstr.com/webapp/wcs/stores/servlet/booklookServlet?bookstore_id-1=303&term_id-1=INCORRECT_VALUE
says "Unable to find the requested term". But when we enter a valid Follett_HEOA_Term_Value it does not give that message:
http://www.bkstr.com/webapp/wcs/stores/servlet/booklookServlet?bookstore_id-1=303&term_id-1=Spring%202011
So, in this case, what we need to do is have the scraper get the Terms for this school (Purdue), and then edit them with the valid Follett_HEOA_Term_Values.
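The check described above can be sketched in Python roughly as follows. The URL shape and the error message are taken from the two example URLs; the function names, the timeout, and the assumption that the page decodes as UTF-8 are ours. It also assumes that the absence of the error message is enough to call a value valid, which is the test used above but may not hold for every store.

```python
from urllib.parse import urlencode, quote
from urllib.request import urlopen

BOOKLOOK_URL = "http://www.bkstr.com/webapp/wcs/stores/servlet/booklookServlet"
NOT_FOUND_MESSAGE = "Unable to find the requested term"


def booklook_url(heoa_store_value, term_value):
    """Build the booklook URL for one store and one candidate term value."""
    query = urlencode(
        {"bookstore_id-1": heoa_store_value, "term_id-1": term_value},
        quote_via=quote,  # encode spaces as %20, as in the working example
    )
    return f"{BOOKLOOK_URL}?{query}"


def page_accepts_term(page_text):
    """Return True if the page does not show the 'term not found' message."""
    return NOT_FOUND_MESSAGE not in page_text


def term_is_valid(heoa_store_value, term_value, timeout=10):
    """Fetch the booklook page and report whether the term value was accepted."""
    url = booklook_url(heoa_store_value, term_value)
    with urlopen(url, timeout=timeout) as response:
        page_text = response.read().decode("utf-8", errors="replace")
    return page_accepts_term(page_text)


if __name__ == "__main__":
    # Prints the URL only; call term_is_valid(...) to actually hit the site.
    print(booklook_url(303, "Spring 2011"))
```

If you loop this over many candidate values, pause between requests so you don't hammer the bookstore's server.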
We've included the Berkeley Cal store (Campus_ID = 101) in the .sql data file as an example of a working Follett store. For all other Follett stores you'll have to set the Follett_HEOA_Term_Value yourself and keep it up to date as new school terms arrive.