===== Introduction =====
The patent literature contains a large body of chemical information dating back more than two hundred years. In fact, some of the earliest granted patents were for chemistry-related inventions. The very first U.S. patent (US X1) was issued to Samuel Hopkins of Philadelphia on July 31, 1790 for “an improvement, not known or used before such discovery, in the making of Pot Ash and Pearl Ash by a new apparatus and process.” Pot ash (crude potassium carbonate) and pearl ash (a purified form of potash) were important ingredients in the production of a number of valuable products including soap and fertilizer. Hopkins also received a patent for his invention in Lower Canada (Quebec) in 1791.
Scientific discoveries and industrialization during the nineteenth century spurred the growth of chemistry-related patents. Notable chemical patents of the period include British chemist William Perkin’s 1856 patent for aniline dye, the first synthetic dye; American John Wesley Hyatt’s 1869 patent for making celluloid, the first artificial plastic; Canadian-American Herbert Henry Dow’s 1892 patent for bromine extraction, one of the first of many chemical processes patented by the Dow Chemical Company; and German chemist Felix Hoffmann’s 1900 patent for acetylsalicylic acid, which was marketed worldwide under the trade name Aspirin.
Table 1. Notable Chemical Patents of the 19th and 20th Centuries
Today, patents continue to be vital to the chemical, pharmaceutical, agricultural, materials, energy and biotechnology industries. Since 1790 the USPTO has issued more than 8 million patents. U.S. patent 8,000,000 was issued on August 16, 2011. The total number of patent documents worldwide is estimated to be more than 70 million. In 2010 inventors filed approximately 1.98 million new patent applications around the world. Of these, approximately 22 percent were chemistry related. (WIPO, 2011)
Most patents of interest to chemists cover compositions of matter (new chemical compounds, mixtures, pharmaceuticals) or processes (e.g., synthesis of a drug). Under the patent laws of some countries it is even possible to patent things such as 3D atomic structures, structural databases, biological sequences, and their uses, which may result from the genomics field. Other types of patents are issued for machines, products, business methods, plants and industrial designs. With hundreds of thousands of patents issued annually by various countries, a great deal of effort is necessary to organize patent documents for effective retrieval and analysis.
Patent offices disseminate patent information in a variety of ways. Historically, this was done by publishing abstracts of issued patents and printed copies of patents. In the U.S. and a few other countries, copies of patents were distributed to academic and public libraries designated as patent depositories. Beginning in the mid-1990s, patent offices utilized the internet to disseminate patent information. In 1994 the USPTO launched the first public patent database on the internet. The European Patent Office (EPO) launched its Espacenet patent database in 1998. Patent offices also sell patent data (at nominal prices) to commercial patent information companies, most notably Thomson Reuters, IFI CLAIMS, LexisNexis, and Questel, that incorporate it into their database products. The internet abounds with free patent databases created by academics, librarians, entrepreneurs, collectors, and patent enthusiasts. Some of the most notable include Google Patents, FreePatentsOnline and Patent Lens.
Patents are covered in many chemical literature abstracting and indexing services. One of the most important of these is Chemical Abstracts, which is published by the Chemical Abstracts Service, a division of the American Chemical Society. The online version of CA is called SciFinder. CA/SciFinder currently includes patents from more than 60 countries. Approximately 18 percent of the documents indexed in CA/SciFinder are patents.
===== What is a Patent? =====
A PATENT is a grant of property rights to an inventor by a government for a limited time, generally 20 years. Patents may be sold, licensed, transferred or bequeathed. Patent rights are exclusionary in nature. Patent owners have the right to exclude others from making, using, importing or selling the invention in the country (or countries) where the patent was issued. The prohibition on importation extends to products made by a patented process. Patents are part of a group of materials generally referred to as INTELLECTUAL PROPERTY that also includes copyrights, trademarks, designs, trade secrets, new plant varieties and traditional knowledge.
To obtain a patent, the inventor must file certain documents, and the invention itself must exhibit the qualities of NOVELTY, UTILITY (usefulness), and INVENTION (unobviousness, ingenuity), defined as follows:
Novelty: The concept that the claimed invention must be totally new. The invention must never have been made public in any way, anywhere in the world, before the date on which the patent application is filed. In the U.S. prior to 2011 this was determined by the date of invention.
Utility: The invention has some practical utility, and is fit for some practical, desirable or commercial purpose. For a chemical, utility might mean that it shows a beneficial property, such as a pharmacological effect, or it might be an intermediate that is used in synthesizing a product that has an end use.
Invention/ingenuity: The invention must not be obvious to an observer who is "skilled in the art". This assumes that the claims defining an invention in a patent application must involve an inventive step that, when compared with what is already known (i.e., PRIOR ART), would not be obvious to someone who is an expert in that field.
In addition, the invention must be disclosed in a clear and complete manner in the patent application. There can be no secret ingredients or information withheld from the application. This is known as the DISCLOSURE requirement.
Patent searching is often undertaken in order to prove novelty, either prior to filing a patent application or during the prosecution of a patent application. This process is known as PRIOR ART SEARCHING or PATENTABILITY SEARCHING. In this case, the older patent literature is quite important. A second type of patent search involves INFRINGEMENT, i.e., trying to determine whether someone else is illegally claiming the rights to an invention that is yours. In this case, the search must be exhaustive, but is limited to the last 20 years or so. VALIDITY searches are conducted in order to locate prior art that invalidates one or more claims in a published application or issued patent. CLEARANCE searches (also known as “freedom-to-operate” or “right-to-use”) are conducted prior to using a process or manufacturing a product that might be patented by another party. STATE-OF-THE-ART searches are comprehensive searches of the patent and non-patent literature conducted to determine the current state of development of a specific technology or technical field.
The inventor is the PATENTEE, and the inventor may assign the patent protection rights to another person or company, the ASSIGNEE. The inventor submits the first patent application on a certain date known as the PRIORITY APPLICATION DATE. Under the "Paris Convention for the Protection of Industrial Property of 1883," the priority date is also considered to be the date a patent application is filed in any country that has signed the Paris Convention (as long as the inventor files the application in the other country within 12 months of the priority date). This results in a PATENT FAMILY of publications related to the invention. Some of these may be patents (perhaps in a language that is easier to read than that of the original), whereas others merely document the invention disclosed by the applicant as of the priority date. Regional patenting bodies, such as the European Patent Office, issue patents for groups of countries, and the Patent Cooperation Treaty permits a single filing to initiate the patenting process in a number of countries. The PCT now provides for filing in over 100 countries. The importance of the priority date becomes very obvious when cases arise where two companies file for the same invention at about the same time. Such was the case when ICI Ltd. filed EP 399731 with a priority date of 23 May 1989, and Merck & Co. filed EP 400974 on 30 May 1989.
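The 12-month priority window described above can be illustrated with a short Python sketch. The function name is hypothetical, and the calculation is deliberately simplified: it treats the deadline as the same calendar day one year later and ignores any extensions for weekends or holidays that individual offices may allow.

```python
from datetime import date


def within_priority_year(priority_date: date, later_filing: date) -> bool:
    """Return True if later_filing falls within 12 months of priority_date.

    Simplified assumption: the deadline is the same calendar day one year
    later (29 February falls back to 28 February).
    """
    try:
        deadline = priority_date.replace(year=priority_date.year + 1)
    except ValueError:
        # priority_date is 29 February and the following year is not a leap year
        deadline = priority_date.replace(year=priority_date.year + 1, day=28)
    return priority_date <= later_filing <= deadline


# Dates quoted in the text for the ICI and Merck filings
ici_date = date(1989, 5, 23)
merck_date = date(1989, 5, 30)
gap_days = (merck_date - ici_date).days  # number of days separating the two dates
```

With these dates the two filings are only a week apart, which is why the exact priority date matters so much in a dispute over who filed first.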
===== What is Not Patentable? =====
Patents will not be granted for an invention that has already been publicly disclosed in another patent application or article, or through public use or sale. Even a posting on the internet can negate the criterion of novelty. In the U.S., an inventor cannot obtain patent protection if an article is published about the invention more than one year before the filing is made. In other parts of the world the patent would be invalidated on the first day of publication of the article. Much of the information in the patent literature is, in fact, never published in any other format. However, some chemists denigrate patents as information sources since the titles, descriptions, and claims tend to use general, broad terminology, rather than the precise wording typically found in journal articles or other forms of primary scientific literature.
In general, patents are not granted for discoveries, scientific theories and naturally occurring substances. Inventions that are deemed to be contrary to the public good (terrorist devices) or having national security implications (nuclear weapons) may be unpatentable. Some countries restrict or prohibit patenting of diagnostic, therapeutic and surgical methods of treatment for humans or animals.
===== What is a Patent Specification? =====
Patent searching is complex, and a basic understanding of the patenting process is necessary in order to comprehend the different types of documents involved. Most countries publish patent applications 18 months after the filing or priority date. In addition, a second patent document may be issued after the patent is granted. Other types of patent documents include REISSUE PATENTS, DEFENSIVE PUBLICATIONS, STATUTORY INVENTION REGISTRATIONS, CERTIFICATES OF CORRECTION, and CERTIFICATES OF REEXAMINATION.
Some countries permit inventors to file PROVISIONAL PATENT APPLICATIONS. Provisional applications are not examined or published but may be cited in a later patent application. When the inventor applies for a patent, the PATENT SPECIFICATION must be submitted. The specification is a technical document that contains a description of the invention. A typical patent document will include drawings, background of the invention, a summary and a detailed description of the invention, examples, and one or more CLAIMS that define what is legally covered by the invention. The FRONT PAGE of a published patent specification contains bibliographic information, an abstract and selected drawing.
Published patent documents are assigned unique identification numbers. The standard format for a patent number is the two-letter COUNTRY CODE followed by the SERIAL NUMBER followed by a one- or two-letter KIND CODE. For example, US 7,000,000 B1 is the number of a U.S. patent and CA 2683867 A1 is the number of a Canadian published application. Older patent numbers may not have country or kind codes. Country codes are fairly obvious, e.g. US for United States, CN for China, GB for Great Britain, DE for Germany, JP for Japan, etc. The country code for patent numbers assigned by the European Patent Office is EP; WO is reserved for PCT applications published by the WIPO. Kind codes indicate the type of document. The code “A” is for published applications (first-stage publications) and search reports; “B” and “C” are reserved for issued patents. Other kind codes include “S” for U.S. design patents and “P” for U.S. plant patents. Numbering systems vary by patent office. Some patent offices use a continuous number system while others use a year-number system; some offices use both systems. For example, the USPTO began numbering patents in 1836; patent 1 was issued on July 13 of that year. This series continues up to the present day, with patent 8,087,093 being issued on December 27, 2011. When the USPTO began publishing applications in 2001, it decided to number them using a year-number format. The number assigned to the first application published each year from 2001 forward is 2001/0000001, 2002/0000001, 2003/0000001, 2004/0000001, etc. In contrast, the Canadian Intellectual Property Office assigns one number to each application and simply updates the country code at each stage, e.g. CA 2,258,975 A1 for the published application and CA 2,258,975 C for the issued patent.
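As a rough illustration of the country code, serial number and kind code layout described above, the following Python sketch splits a patent number into its three parts. The function and class names are hypothetical. The pattern assumes a kind code made of one letter optionally followed by one digit (as in B1 or A1), and it accepts commas and a slash in the serial number so that year-number formats also parse; real-world numbers, especially older ones, may not fit this pattern.

```python
import re
from typing import NamedTuple, Optional


class PatentNumber(NamedTuple):
    country: str         # two-letter country code, e.g. "US"
    serial: str          # serial number with thousands separators removed
    kind: Optional[str]  # kind code such as "B1", or None if absent


# Assumed layout: country code, serial number, optional kind code
_PATTERN = re.compile(r"^\s*([A-Z]{2})\s*([\d,/]+)\s*([A-Z]\d?)?\s*$")


def parse_patent_number(text: str) -> PatentNumber:
    """Split a string such as 'US 7,000,000 B1' into its components."""
    match = _PATTERN.match(text.upper())
    if match is None:
        raise ValueError(f"unrecognized patent number: {text!r}")
    country, serial, kind = match.groups()
    return PatentNumber(country, serial.replace(",", ""), kind)


us_patent = parse_patent_number("US 7,000,000 B1")
ca_application = parse_patent_number("CA 2683867 A1")
```

Normalizing numbers this way makes it easier to match the same document across databases that print the serial number with or without separators.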
Most countries allow the inventor to define the legal limits of the patent in the claims in both generic terms and specific terms. For chemical patents, generic inventions usually take the form of a MARKUSH STRUCTURE, a structure that contains one or more structural variables based on a list of stated alternatives. Each compound that could be constructed from the list is covered by the claims.
===== How Long Does a Patent Last? =====
In the past there was a great deal of variation in the terms of protection afforded by patents issued in different countries. In the early 1990s, members of the World Trade Organization (formerly the General Agreement on Tariffs and Trade) agreed to recognize patents in all fields of technology for a 20-year period that begins with the date of the first patent application filed in any country, also known as the PRIORITY DATE. On June 8, 1995, the new term took effect in the U.S. Prior to that date U.S. patents were issued for a term of 17 years from the date of issue. Design patents, which protect original ornamental designs, and utility models, which protect minor improvements, have much shorter terms, generally 5-14 years.
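The two U.S. term rules mentioned above can be sketched in Python as follows. This is only an approximation under stated assumptions: the text does not say which date decides which rule applies, so the sketch assumes the filing date does, and it ignores transitional provisions, term adjustments and extensions, and lapses for unpaid fees. The function names are hypothetical.

```python
from datetime import date

# Date on which the 20-year term took effect in the U.S. (from the text)
US_TERM_CHANGE = date(1995, 6, 8)


def add_years(d: date, years: int) -> date:
    """Shift a date by whole years, mapping 29 February to 28 February if needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def nominal_us_expiry(filing_date: date, issue_date: date) -> date:
    """Estimate a nominal expiry date for a U.S. utility patent.

    Assumption: applications filed on or after June 8, 1995 get 20 years
    from filing; earlier ones get 17 years from issue.
    """
    if filing_date >= US_TERM_CHANGE:
        return add_years(filing_date, 20)
    return add_years(issue_date, 17)


newer = nominal_us_expiry(date(2000, 1, 15), date(2002, 3, 1))
older = nominal_us_expiry(date(1990, 2, 1), date(1992, 6, 30))
```

An estimate like this is useful for narrowing the date range of an infringement search, but the actual status of any given patent should be confirmed from legal status records.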
===== Classification of Patents =====
Patent classification systems provide a means to organize, store and retrieve patent documents efficiently and effectively. There are several patent classification systems in use today, all of which are organized around technical subject matter. The most widely used is the International Patent Classification (IPC). IPC is a hierarchical classification that consists of eight sections, labeled A through H. Section C covers chemistry and metallurgy. Each section is further subdivided into classes, subclasses, and groups. There are approximately 70,000 IPC codes. Chemical compounds are classified in section C according to their chemical structures. For example, C07C 31/18 is the IPC code for polyhydroxylic acyclic alcohols. Chemical processes and apparatus are located in class B01. Pharmaceutical compounds are classified in subclass A61K, Preparations for Medicinal, Dental, or Toilet Purposes.
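The hierarchy of an IPC code such as C07C 31/18 can be made explicit with a small Python sketch. The names below are hypothetical, and the split of a group into a main group (before the slash) and a subgroup (after it) follows the usual IPC layout rather than anything stated above. Only complete symbols with a group part are handled.

```python
import re
from typing import NamedTuple


class IPCSymbol(NamedTuple):
    section: str     # one of A through H, e.g. "C"
    ipc_class: str   # section plus two digits, e.g. "C07"
    subclass: str    # class plus one letter, e.g. "C07C"
    main_group: str  # digits before the slash, e.g. "31"
    subgroup: str    # digits after the slash, e.g. "18"


# Assumed layout: section letter, two-digit class, subclass letter, group/subgroup
_IPC = re.compile(r"^([A-H])(\d{2})([A-Z])\s*(\d{1,4})/(\d{2,6})$")


def parse_ipc(symbol: str) -> IPCSymbol:
    """Break a full IPC symbol into its hierarchical levels."""
    match = _IPC.match(symbol.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized IPC symbol: {symbol!r}")
    section, cls, sub, main_group, subgroup = match.groups()
    return IPCSymbol(section, section + cls, section + cls + sub, main_group, subgroup)


alcohols = parse_ipc("C07C 31/18")
```

Truncating a symbol to its class or subclass level (here C07 or C07C) is a common way to broaden a classification search.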
The European Classification (ECLA), which is maintained by the EPO, is based on the IPC and uses many of the same codes. However, ECLA has about 140,000 codes, almost twice as many as the IPC. In 2010, the EPO and USPTO announced that ECLA will be the basis of a new Cooperative Patent Classification (CPC) to be implemented in 2012.
The U.S. Patent Classification (USPC) is used only by the USPTO. The USPC consists of approximately 450 classes and 150,000 subclasses. There are dozens of USPC classes relating to chemistry. The three largest in terms of total patents are Class 520-528, Synthetic Resins or Natural Rubbers, which has 217,166 patents; Class 532-570, Organic Compounds, with 231,160 patents; and Classes 424 and 514, Drug, Bio-Affecting and Body Treating Compositions, with 175,826 patents. The USPC will be phased out in 2012 in favor of a new classification system jointly developed by the EPO and USPTO.
===== Abstracting and Indexing of Patents: Major Patent Databases =====
Patent searchers have many options when it comes to searching patent databases. There are dozens of public and fee-based patent databases on the web, plus many more abstracting and indexing (A&I) databases that index the patent literature. A complete survey of all patent search systems is not possible here. Two excellent sources of information about patent databases are the Intellogist and Patent Information Users Group wiki.
Since patents for a particular invention may appear at different times in a number of countries, abstracting and indexing services generally have adopted the practice of abstracting only the first patent issued, called the BASIC PATENT. Later patents in the patent family are indexed as EQUIVALENT PATENTS. Database producers are not always in agreement when it comes to a definition of basic and equivalent patents. On STN, members of a patent family have a common priority application number and date. Further complicating the patent family situation are the patent types that may result in related patents, namely:
Division: results from a decision by the patent office that the claims are too broad for a single patent. The application is then split into a parent and divisional applications, each claiming a different invention.
Continuation: results when a second or subsequent application is filed while the original application is still pending.
Continuation-in-Part (CIP): results when a second or subsequent application that includes new material is filed while the original application is still pending.
USPTO Patent Databases
The USPTO was the first patent office to use the internet to disseminate patent information, launching its first patent database in 1994. Today the USPTO website hosts two patent databases, one for issued patents (PatFT) and one for published applications (AppFT). PatFT is updated weekly on Tuesdays and AppFT is updated weekly on Thursdays. Full-text searching is available for patents issued from 1976 forward and published applications from 2001 forward. Prior to 1976, patents may be retrieved by number, date of issue and current USPC or IPC classification. Patent document images are stored in TIFF format and require a TIFF viewer to display and print. Searching by chemical structure, formula or registry number is not available. The USPTO website also hosts a trademark database that is useful for locating chemical trade names; a patent and trademark assignments database that contains records of assignment changes; and PAIR, Patent Application Information Retrieval, which contains patent prosecution histories, legal status information and file wrapper documents.
PATENTSCOPE is the World Intellectual Property Organization (WIPO) free patent database. It contains all published international (PCT) applications (around 3.5 million) and almost 75 million patent documents from participating national and regional offices, including Brazil, China, the European Patent Office, India, Japan, the Republic of Korea and the United States. PATENTSCOPE offers chemical structure and substructure searching, an in-house AI-based machine translation tool called WIPO Translate, as well as proximity searching and full-text searching.
FreePatentsOnline is a free patent database owned by SumoBrain Solutions Co. It currently offers full-text searching of USPTO, EPO, WIPO/PCT documents from the mid-1970s forward. Patent abstracts of Japan and non-patent literature collections are also searchable. The U.S. national collection contains about 28,000 records for withdrawn patents that are not included in the USPTO databases. Withdrawn patents are applications that have been approved by the USPTO but withdrawn by the applicant (or USPTO) prior to issue. US, EP and WO documents are stored in PDF format. Full-text searching is not available prior to 1976. US patents from 1836 to 1975 are available in PDF and may be retrieved by number, date of issue and classification. Searching by chemical structure, formula or registry number is not available.
Patent Lens is a free patent database produced by Cambia, an independent, non-profit institute located in Brisbane, Australia. Patent Lens currently contains approximately 11 million full-text patents from Australia (1998-present), the EPO (1980-present), WIPO/PCT (1978-present) and the USPTO (1976-present). Searching by chemical structure, formula or registry number is not available. However, it is possible to search for biological sequences listed in U.S. patent documents using NCBI’s BLAST software.
ChemSpider is an award-winning free chemical information search engine owned by the Royal Society of Chemistry. It provides access to over 25 million chemical structures, properties and other information from more than 400 vetted data sources. Search options include systematic name, synonym, trade name, registry number, SMILES, InChI, CSID, structure and properties. Patent data is sourced from Google Patents and SureChem, a proprietary chemical patent database. Patent coverage includes USPTO, EPO, WIPO/PCT patent documents and Japanese patent abstracts from the late 1970s forward. A subscription to SureChem is required to see more than the first three patents retrieved in a search result.
Reaxys is a web-based chemistry database launched by Elsevier in 2009 as the successor to the CrossFire Commander client-based search system. (Elsevier discontinued support for CrossFire at the end of 2010.) Reaxys contains experimental substance and reaction data published in the chemical literature dating back to the 1770s. Much of the historical data in Reaxys was first published beginning in the late 1800s in the Beilstein and Gmelin handbook series. Patent literature from several countries was covered in Beilstein and Gmelin from the early 1900s to around 1980. Current patent coverage in Reaxys is limited to English-language EP, US and WO patent documents in International Patent Classification classes C07 (Organic Chemistry), A61K (Medicinal, Dental, Cosmetic Preparations) and C09B (Dyes). A maximum of 500 compounds per patent are indexed. The table below shows patent coverage for selected countries and patent offices.
Searchable patent bibliographic data in the form-based search includes inventor name, assignee name, patent number, country and publication year. In the advanced search form, patent search fields include:
Date of Publication (PBIB.PD)
Application Number (PBIB.AP)
Date of Filing (PBIB.FD)
Manually Excerpted (PBIB.MAN)
Priority Number (PBIB.PRNR)
Priority Date (PBIB.PRD)
Main IPC (PBIB.ICM)
Secondary IPC (PBIB.ICS)
Family Member: Patent Number (PBIB.FPN)
Family Member: Status Code (PBIB.FPN)
Family Member: Date of Publication (PBIB.FPD)
Family Member: Application Number (PBIB.FAP)
Family Member: Date of Filing (PBIB.FFD)
Family Member: Indexed Patent (PBIB.FIDX)
Other searchable patent-specific data fields include:
Prophetic Compound (PSD.PRC)
Related Markush Structure (PSD.MARPRN)
Location in Patent (PSD.LCN)
The Elsevier TrainingDesk website includes many helpful tutorials on searching Reaxys.
PubChem is a free chemical search system that provides information on the biological activities of small molecules. It consists of three linked databases, PubChem Substance, PubChem Compound, and PubChem BioAssay. PubChem is operated by the U.S. National Library of Medicine. It covers about 30 million compounds. Many compounds are linked to selected patents.
Cippix covers USPTO, EPO, WIPO/PCT, CA, DE patent documents and Japanese full text documents. Cippix extracts chemical entities automatically from English, German, French, and Japanese sources with weekly updates. Cippix provides advanced proximity searches on compound and document level. Subscription to Cippix is required and can be purchased online for immediate access.
INPADOC, the International Patent Documentation Center, covers about 60 national patent offices as well as international patent bodies such as WIPO and the European Patent Office (EPO). It is therefore a major source of patent family data and forms the basis for Chemical Abstracts Service's printed patent indexes and, more recently, the data found in the CA and CAplus files.
IFI CLAIMS files cover U.S. chemical patents from 1950.
Derwent, now a Thomson Reuters company, developed one of the world's largest patent services, covering over 30 countries plus the European Patent Office and Patent Cooperation Treaty countries. The WPI (World Patents Index) database covers pharmaceuticals from 1963 and other categories of chemicals for the periods shown below:
Agricultural chemicals, 1965-
Plastics and polymers, 1966-
All other chemistry, 1970-
Derwent also produced the Derwent Patents Citation Index covering US Examiner's citations from 1984 and EPO and PCT Examiner citations from 1978.
See a sample search of the WPINDEX file and STN's Database Summary Sheet for WPINDEX.
Questel offers Markush structure searching in the WPIM and PHARMSEARCH files. PHARMSEARCH covers US and European pharmaceutical patents from 1984 to the present, and WPIM has all of the Derwent chemical and pharmaceutical patents. The Merged Markush Service (MMS) is a structure file produced by INPI (the French Patent and Trademark Office) and Derwent that can be searched by generic structure input. MMS includes all of the structures previously available in the MPHARM (Markush PHARMSEARCH) and DWPIM (Derwent World Patents Index Markush) files. MMS also includes all the compounds from the Derwent Chemical Resource (DCR). A new database on Questel is PlusPat, a source of worldwide patent data that extends back into the 19th century. PlusPat merges the European Patent Office (EPO) DOCDB file (which is the basis for EPO databases, including Esp@cenet) with information from additional Questel databases.
===== CAS Patent Databases on STN =====
Chemical Abstracts Service's coverage of patents began with the print CA in 1907. Enhanced coverage of patents was initiated around 1960. In 1999, CAS began to include patent family information in its CAplus file. There are two types of patent family relationships in CAplus.
Closely related family members: have a simple priority application relationship. These are usually included in a single record in CAplus.
Extended patent family members: result from complex relationships.
Multiple, but at least one, common priority applications from different countries
Relationships resulting from division, addition, continuation, or continuation-in-part patents
Additional records may be created by CAS in order to capture any new information that is included in these family members. Patent coverage is extremely fast in CAplus now, with most chemical patents making it into the database within two days of issue.
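The idea of an extended patent family, in which members are linked by at least one common priority application, can be sketched as a grouping problem. The Python example below is only an illustration and not the method CAS uses. The function name, publication labels and priority identifiers are all hypothetical, and links arising from divisions or continuations are not modeled. It merges documents that share any priority using a simple union-find structure.

```python
from collections import defaultdict
from typing import Dict, List, Set


def extended_families(records: Dict[str, Set[str]]) -> List[List[str]]:
    """Group publications that are linked by at least one shared priority.

    records maps a publication label to the set of priority applications
    it claims. Returns a sorted list of families, each a sorted list.
    """
    parent = {pub: pub for pub in records}

    def find(x: str) -> str:
        # Follow parent links to the family representative, compressing the path
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a: str, b: str) -> None:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_seen: Dict[str, str] = {}  # priority -> first publication claiming it
    for pub, priorities in records.items():
        for priority in priorities:
            if priority in first_seen:
                union(first_seen[priority], pub)
            else:
                first_seen[priority] = pub

    groups = defaultdict(set)
    for pub in records:
        groups[find(pub)].add(pub)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical data: EP-B shares P1 with US-A and P2 with JP-C
sample = {
    "US-A": {"P1"},
    "EP-B": {"P1", "P2"},
    "JP-C": {"P2"},
    "DE-D": {"P3"},
}
families = extended_families(sample)
```

In this sample, US-A and JP-C have no priority in common, yet both land in the same extended family because each shares a priority with EP-B. A stricter definition that requires identical priorities would keep them apart.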
STN now has a Markush database in the MARPAT file (1988- ). MARPAT provides access by structure to all the specific and generic substances claimed in patents via Markush structures from 1988 to the present.
USPATFULL has the full text of all patents issued by the USPTO from 1975 (with partial coverage from 1971-74). The US Patent Office began to publish US patent applications on March 15, 2001, so the USPATFULL database now includes those documents. They are distinguished from the granted patents by their kind code, A1. CAS indexes chemical patents in USPATFULL, including CAS Registry Numbers. See the example for the USPATFULL file on STN, which has a thesaurus for the USPTO Manual of Classifications and for the WIPO IPC.
The USPATOLD (U.S. Patents Pre-1976) database includes more than 3.5 million records, namely, the full text of patents issued by the USPTO from 1790 through 1975. These records were created from original U.S. patent documents that were converted into electronic form through an optical character recognition process. Since OCR is not 100% accurate, some records may include misinterpreted characters, or portions of the patent text may be missing. To enhance their retrieval, approximately 500,000 USPATOLD records that are also covered by CAplus were supplemented with CAS data.
===== Examples of Chemical Patents =====
Note the increase in the US patent numbers issued during the years covered by the table below.
The CA abstract for the last patent above is reproduced below: