bsgreenb edited this page Feb 13, 2012 · 9 revisions

While we've provided 1,114 schools for you to start with, thousands more schools have scrapable online bookstores. You can incorporate them by adding them to the database.

The rows you'll need to add for a new school are:

The Bookstore row

  • Bookstore_Type_Id -- This points to the Bookstore_Types table.
  • StoreFront_URL -- The URL where students normally search for textbooks.
  • Fetch_URL -- The root URL that our scraper scrapes from.
  • Store_Value -- A bookstore ID value required on Follett and bncollege systems.
    • Follett -- Example: Purdue is 10258, a value you can extract from the StoreFront_URL above.
    • bncollege -- Get this from the URL of the textbook search. Example: Indiana University is 39052 (see StoreFront_URL above).
    • other systems -- Set it to NULL
  • Multiple_Campuses -- A Yes/No value indicating whether multiple stores share one site. Only used for bncollege.
    • bncollege -- Some Barnes & Noble colleges have multiple campuses searchable from the same site. One example is UH, where you can see the Campus dropdown: http://uh.bncollege.com/webapp/wcs/stores/servlet/TBWizardView?catalogId=10001&storeId=19067&langId=-1. For these Barnes & Noble colleges, set the value to Y.
    • other systems -- Set it to NULL
  • Follett_HEOA_Store_Value -- A special Follett bookstore ID value required to get class_items on Follett systems. See the Scraper Documentation for an explanation of how it's used.
    • Follett -- The Follett_HEOA_Store_Value is the store_id associated with the given bookstore. You can extract it from the image URL of the logo on that bookstore’s site. Example: 303 is the Purdue Follett_HEOA_Store_Value. We know this from viewing the source: <img src="http://images.efollett.com/htmlroot/images/templates/storeLogos/CA/303.gif" id="logo">.
    • other systems -- Set it to NULL
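
As a rough sketch of the two extractions described above, the Python below pulls a bncollege store ID out of a textbook search URL and a Follett_HEOA_Store_Value out of a logo image tag. The function names are hypothetical and not part of the scraper. It also assumes that the `storeId` query parameter in the bncollege URL is the value to use as Store_Value, which the UH example suggests but does not state outright.

```python
import re
from urllib.parse import urlparse, parse_qs


def bncollege_store_value(url):
    """Return the storeId query parameter of a bncollege URL, or None.

    Assumption: storeId is the value to store in Store_Value.
    """
    params = parse_qs(urlparse(url).query)
    values = params.get("storeId")
    return values[0] if values else None


def follett_heoa_store_value(html):
    """Return the numeric file name of the storeLogos image, or None.

    Matches paths such as .../storeLogos/CA/303.gif and returns "303".
    """
    match = re.search(r'storeLogos/[^/"]+/(\d+)\.gif', html)
    return match.group(1) if match else None


uh_url = ("http://uh.bncollege.com/webapp/wcs/stores/servlet/TBWizardView"
          "?catalogId=10001&storeId=19067&langId=-1")
purdue_logo = ('<img src="http://images.efollett.com/htmlroot/images/'
               'templates/storeLogos/CA/303.gif" id="logo">')

print(bncollege_store_value(uh_url))          # 19067
print(follett_heoa_store_value(purdue_logo))  # 303
```

Both functions return strings, so convert with `int()` if your column is numeric.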

The Campus row

  • Bookstore_ID -- Point it to the ID of the bookstore you just added.
  • Campus_Value -- A campus ID value required by Follett and bncollege systems.
    • Follett -- A few Follett stores have multiple campuses. Get the Campus_Value by viewing the source and reading the values from the Campus select. Example: At ASU (http://www.bkstr.com/Home/10001-196905-1) the Campus select has the Polytechnic option as <option value="2345">Polytechnic</option>.
    • bncollege -- Get it by viewing the source on the textbooks page. For example, at Indiana University you can find the hidden input with the campusId 31379761: <input title="campusid" type="hidden" name="campusId" value='31379761'/>. Some stores, like UH mentioned previously, have multiple campus values which you can get from the source; you should have a Campus row for each of them.
    • other systems -- Set it to NULL
  • Program_Value -- Identifier for "Programs" used in Follett's system.
    • Follett -- Example: At Purdue, the Program_Value is 553, which we know from viewing the source and looking up the value in the select: <option value="553">Purdue-West Bookstore</option>.
    • other systems -- Set it to NULL
  • Location -- Optional indicator of where the store is located
  • Enabled -- Self-explanatory
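
Rather than reading the page source by eye, the option values and hidden campusId inputs mentioned above could be collected with Python's standard `html.parser`. This is only a sketch: `CampusValueParser` and `extract_campus_values` are hypothetical names, and the parser gathers every `<option>` on the page, so you would still need to pick out the ones belonging to the Campus or Program select.

```python
from html.parser import HTMLParser


class CampusValueParser(HTMLParser):
    """Collects <option> value/label pairs and hidden campusId inputs."""

    def __init__(self):
        super().__init__()
        self.options = []      # (value, label) pairs from <option> tags
        self.campus_ids = []   # values of inputs named "campusId"
        self._current_value = None
        self._label_parts = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "option":
            self._flush_option()  # a previous option may lack a closing tag
            self._current_value = attrs.get("value")
            self._label_parts = []
        elif tag == "input" and attrs.get("name") == "campusId":
            self.campus_ids.append(attrs.get("value"))

    def handle_data(self, data):
        if self._current_value is not None:
            self._label_parts.append(data)

    def handle_endtag(self, tag):
        if tag in ("option", "select"):
            self._flush_option()

    def _flush_option(self):
        if self._current_value is not None:
            label = "".join(self._label_parts).strip()
            self.options.append((self._current_value, label))
            self._current_value = None


def extract_campus_values(html):
    """Return (options, campus_ids) found in the given page source."""
    parser = CampusValueParser()
    parser.feed(html)
    parser.close()
    parser._flush_option()
    return parser.options, parser.campus_ids


sample = ('<select><option value="2345">Polytechnic</option></select>'
          '<input title="campusid" type="hidden" name="campusId" '
          "value='31379761'/>")
options, campus_ids = extract_campus_values(sample)
print(options)
print(campus_ids)
```

The same option list covers the Follett Program_Value case, since that value is also read from an `<option>` in a select.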

Campus_Names

Campuses can go by multiple names, for example "IU" and "Indiana University". Indicate the primary one with Is_Primary. You need a primary Campus_Name for a school to be enabled.

## Adding Follett_HEOA_Term_Value to the Terms_Cache

The Follett_HEOA_Term_Value is the hardest part of adding a (Follett) school to the database. It’s set individually by each school, so there are several ways to discover what it is:

  • View the source of the school’s class schedule; much of the time the term values you find there will work
  • Use Google. If someone has publicly linked to the booklookServlet before, then its Follett_HEOA_Term_Value will be in that link.
  • Try generic/frequent values like “Fall+2011”.
  • Many schools use integer values, so you can use wget or curl to iterate through a range like 1-10,000

We verify a Follett_HEOA_Term_Value by trying it on the booklook URL. To illustrate, this URL with an incorrect Follett_HEOA_Term_Value

http://www.bkstr.com/webapp/wcs/stores/servlet/booklookServlet?bookstore_id-1=303&term_id-1=INCORRECT_VALUE

says "Unable to find the requested term." But when we enter a valid Follett_HEOA_Term_Value, that message does not appear:

http://www.bkstr.com/webapp/wcs/stores/servlet/booklookServlet?bookstore_id-1=303&term_id-1=Spring%202011

So, in this case, we have the scraper get the Terms for this school (Purdue), and then edit them to add the valid Follett_HEOA_Term_Values.
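
The check described above, together with the idea of trying generic or integer candidates, might be automated along these lines. This is a sketch under assumptions: `booklook_url`, `find_valid_terms`, and `fetch_text` are hypothetical helpers, not part of the scraper, and any page that lacks the "Unable to find the requested term" message is treated as a hit, so a different error page would be misread as success.

```python
from urllib.parse import urlencode, quote
from urllib.request import urlopen

BOOKLOOK_URL = "http://www.bkstr.com/webapp/wcs/stores/servlet/booklookServlet"
NOT_FOUND = "Unable to find the requested term"


def booklook_url(heoa_store_value, term_value):
    """Build the booklook URL for one store and one candidate term value."""
    query = urlencode(
        {"bookstore_id-1": heoa_store_value, "term_id-1": term_value},
        quote_via=quote,  # encode spaces as %20, as in the example URL
    )
    return f"{BOOKLOOK_URL}?{query}"


def fetch_text(url):
    """Download a page as text (plain HTTP GET, no retries)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def find_valid_terms(heoa_store_value, candidates, fetch=fetch_text):
    """Return the candidates whose page lacks the 'not found' message."""
    valid = []
    for term in candidates:
        page = fetch(booklook_url(heoa_store_value, term))
        if NOT_FOUND not in page:
            valid.append(term)
    return valid


print(booklook_url(303, "Spring 2011"))
```

A call such as `find_valid_terms(303, ["Fall 2011", "Spring 2011", *range(1, 101)])` would try both generic names and a small integer range. If you scan a large range, consider pausing between requests.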

We've provided the Berkeley Cal store (Campus_ID = 101) in the .sql data file as an example of a working Follett store. For all other Follett stores you'll have to set the Follett_HEOA_Term_Value yourself and keep it up to date as new school terms arrive.
