Chemical names generated by chemists can suffer from quality issues. Systematic nomenclature is not easy and software can definitely help…read this paper for example. If systematic names are not correct then either name to structure conversion algorithms or dictionary look-ups may fail. As we work to provide document markup capability and linkage to chemical structures contained within chemistry documents we need to account for these issues with naming.

As an example, see this online publication where even the same name is inconsistent. See below…

An interface is now available to edit the name “under the text” and therefore allow it to be converted OR to link it directly to a chemical on ChemSpider. In this way we are not adulterating the original text, but an “incorrect name”, or a name that is not associated with a structure, can now have a structure available for display.

The input is easy…click edit in the structure balloon and then either edit the name or link to a CSID. Also, through this dialog you can declare it as a primary compound, secondary or catalyst. We can of course change this/add to this to be reactant, product, solvent etc.

Following the association/correction in this way the CORRECTed name shows up in the Balloon and the resulting structure is shown (and matches what’s in the article). One more step to optimal document markup.
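The idea of storing a correction “under the text” can be sketched as a small record that keeps the printed name untouched and carries the edited name or CSID alongside it. This is only an illustration, not the ChemMantis implementation: the class name, field names and the example CSID value are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union

# Roles offered in the edit dialog described above (more could be added later).
ALLOWED_ROLES = {"primary", "secondary", "catalyst"}


@dataclass
class NameAnnotation:
    """Hypothetical record for one chemical name found in a document."""
    original_text: str                    # name exactly as printed; never modified
    corrected_name: Optional[str] = None  # edited name stored "under the text"
    csid: Optional[int] = None            # direct link to a ChemSpider ID, if set
    role: str = "primary"

    def __post_init__(self) -> None:
        if self.role not in ALLOWED_ROLES:
            raise ValueError(f"unknown role: {self.role!r}")

    def lookup_key(self) -> Tuple[str, Union[str, int]]:
        """Return what should be resolved to a structure for the balloon.

        A direct CSID link wins, then the corrected name, then the
        original text as printed.
        """
        if self.csid is not None:
            return ("csid", self.csid)
        if self.corrected_name:
            return ("name", self.corrected_name)
        return ("name", self.original_text)


# Usage: the printed name stays as it is, only the lookup changes.
annotation = NameAnnotation("2-methyl propanol")
annotation.corrected_name = "2-methylpropan-1-ol"
print(annotation.original_text, "->", annotation.lookup_key())
```

Keeping the original text and the correction in separate fields is what lets the article display unchanged while the balloon shows the corrected name or linked structure.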


We are adding additional dictionaries to ChemMantis to support linking to external information. Wikipedia is a rich source of information for chemists and we have chosen to connect out to Wikipedia for details about named reactions. Now, when a person marks up a document and highlights a particular named reaction then the link to Wikipedia is used to populate the information balloon on the article. An example is shown on this article on ChemMantis and the balloon is shown below for the Knoevenagel condensation.


As we work on ChemMantis it is clear that we want to expand the integration out to external sources of information as much as possible rather than limit the connectivities to the ChemSpider platform. We have started to build the necessary dictionaries to support bacteria, fungi, viruses etc., so it makes sense to connect these up to external resources. As a proof of concept we are using Wikipedia sources to directly feed the “Species Balloons” and have enabled searching of Wikipedia, Google and Entrez directly from the balloon. As an example of the integration we see below the species balloon filled with the lead of the article from Wikipedia for Zymomonas mobilis (click on the thumbnail).

 

From the balloon it is possible to search across Entrez, Google and directly into Wikipedia for more information. For this particular bacterium Entrez gives a list of results as shown below (click on the thumbnail). We are using a similar approach with elements now. Rather than show a “bare element” in a structure balloon (who needs to see Li for Lithium?) we will display the lead text from Wikipedia for that element. The near future will likely see us link to UniProt and PDB for proteins and out to similar rich sources for other species.


Everything I have reported about ChemMantis to date has emphasized that it only works in Internet Explorer. However, thanks to a comment from Soaring Bear, a member of our Advisory Group, I’m now looking at documents marked up with ChemMantis using the “IE Tab Add-on”. Details can be found here. This is an interim solution until we have direct support in Firefox.


We are progressing quite well with our development of ChemMantis, the document markup system for chemistry-related documents. One of the problems with marking up various types of chemical entity is how “colorful” they can become as you mark up a document. Some examples of markup we are considering are shown in the image. Not all of the supporting dictionaries are in place yet, but they are under development.

If you consider a standard chemistry document with the number of chemicals, reactions, elements and groups that can show up in just a paragraph you can end up with the “Christmas Tree effect”. An example of the effect is shown below. There are actually only three colors/effects shown - mark up of chemical names, mark up of elements and mark up of chemical names with no associated structure…in this case Bis-pi-allylnickel.

 

So, all elements are marked separately and, using the check box capability in the floating window, can be switched on and off. Believe me, you need this. The words carbon, oxygen, nitrogen and hydrogen, for example, will clearly show up a lot in chemistry articles (hydrogen bond, carbon-carbon bond etc). So, there are many cases where you would just want to switch the element view off. One check box and it’s done. The same is true for chemical names, species and, shortly, chemical groups and reaction types. We are also working on splitting out bacteria, fungi, viruses etc.
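The check box behaviour amounts to filtering highlights by category before they are drawn. A minimal sketch of that idea follows; the `Highlight` record, the category labels and the function name are assumptions made for illustration, not part of ChemMantis.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Highlight:
    """Hypothetical marked-up span: character offsets plus a category label."""
    start: int
    end: int
    category: str  # e.g. "chemical_name", "element", "species"


def visible_highlights(highlights: Iterable[Highlight],
                       enabled: Set[str]) -> List[Highlight]:
    """Keep only the highlights whose category check box is ticked."""
    return [h for h in highlights if h.category in enabled]


# Usage: with the "element" box unticked, only the chemical name stays marked.
marks = [Highlight(0, 6, "element"), Highlight(20, 38, "chemical_name")]
print(visible_highlights(marks, {"chemical_name", "species"}))
```

Because the filter only affects what is drawn, the underlying markup is kept and a category can be switched back on without re-processing the document.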

If you want to see where we are at present I encourage you to look at this IUPAC Pure and Applied Chemistry article entitled “Carbon-Carbon Bond Forming Reactions Using Alkyl Fluorides”. The markup does not yet work in any browser other than Internet Explorer so try it there. Hover over the chemical names and you should see the chemical balloon show up as shown below. In the next few days we hope to roll out the connections to related data from this structure balloon. Watch this space.


I’ve posted a new presentation regarding ChemSpider/ChemMantis on Slideshare. The first part is the usual ChemSpider intro stuff. For ChemMantis start at slide 39…

Note that we’ve now started expanding the handling of “Species” by adding specific dictionaries. We’ll be adding support for fungi, bacteria, viruses etc. See slide 75 for a screenshot.


The ChemSpider Forum is starting to be used to source information from the ChemSpider community. Not everyone reads the forum so I am trying to help the person who posted this request by exposing it to the readers of this blog in case you can help…

Looking for a surfactant

I own a small business, Superior Marker Company, and we make a unique type of blue chalk that’s used to mark the Bible, Blueprints and some Textbooks. The chalk erases from the paper, without harming it or leaving a permanent mark. Necessary to our process is a surfactant listed above, with two CAS Numbers, namely 68649-05-8 and 12676-21-0. One trade name for this product is “Armeen Z”, but it is no longer manufactured by Akzo Nobel. I’m scouring the globe for this product so if anyone knows where it can be found or if someone can synthesize it

Please visit the forum and respond if you can help: http://forum.chemspider.com/Default.aspx?g=posts&t=151


I previously posted a YouTube video of ChemMantis, our chemistry document markup system, in action. While it is indicative of how the system works, the detail is lost in the resolution of the video and there have been a number of requests for a higher resolution version. I’ve created a copy of the movie in QuickTime format and it can be downloaded from Mediafire here.

All feedback welcomed…


For those of you performing curation activities on ChemSpider you will likely have noticed the ability to mark a new type of identifier, a shorthand formula. We have enabled this because it has become clear that this could be a useful part of document markup as part of our ChemMantis system. For example, looking at an article let’s consider the excerpt shown below.

Regarding the excerpt you can see a number of highlighted terms, all being shorthand formulae and not depending on name to structure conversion algorithms but rather depending on a lookup dictionary. Each of these names is linked to ChemSpider for direct look up of information associated with the chemicals. The list of shorthand formulae extracted from a couple of hundred articles is actually only a couple of hundred formulae at present. It includes the most obvious compounds that we can all interpret: CH3OH, MeOH, CH3CN, MeCN, CH3COOH, NaCl, NaF, NaCN, KBr, KCl and so on. All of these are immediately interpretable by chemists. There are likely a few more to be found over the coming months but in the past week of reviewing articles from various sources we have actually only added a couple of new formulae. We have also seen value in linking up ions and elements as appropriate. We are likely to add filters for display/not display of elements and ions since we’re of the opinion that displaying every instance of an element in an article is of little value…just imagine how many times you might see the word carbon or hydrogen in an article…carbon-carbon bonds, hydrogen bonding etc. So, we’re switching them off by default. We’ll keep reporting on how we are improving ChemMantis…based on the review of a stack of articles the system has improved dramatically. We are asking for your articles now…combining shorthand formulae and chemical name markup will highlight a document as shown below.
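As an aside, the dictionary-based lookup of shorthand formulae described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the ChemMantis code: the mini-dictionary, the function name and the matching rule (a formula must not be glued to other letters or digits) are all chosen for the example, and the dictionary maps to compound names here rather than to ChemSpider records.

```python
import re

# Illustrative mini-dictionary: shorthand formula -> compound name.
# A real dictionary would link each entry to a ChemSpider record instead.
SHORTHAND = {
    "CH3OH": "methanol",
    "MeOH": "methanol",
    "CH3CN": "acetonitrile",
    "MeCN": "acetonitrile",
    "CH3COOH": "acetic acid",
    "NaCl": "sodium chloride",
    "KBr": "potassium bromide",
}

# Longest entries first, and no letters or digits directly before or after
# a match, so that "NaCl" is not picked out of a longer formula like "NaClO4".
_ALTERNATIVES = "|".join(
    re.escape(key) for key in sorted(SHORTHAND, key=len, reverse=True)
)
_PATTERN = re.compile(r"(?<![A-Za-z0-9])(" + _ALTERNATIVES + r")(?![A-Za-z0-9])")


def find_shorthand(text):
    """Return (start, end, formula, name) for each dictionary hit in text."""
    return [
        (m.start(), m.end(), m.group(1), SHORTHAND[m.group(1)])
        for m in _PATTERN.finditer(text)
    ]


# Usage on a made-up sentence.
sentence = "The residue was dissolved in MeOH and washed with aqueous NaCl."
for start, end, formula, name in find_shorthand(sentence):
    print(f"{formula} at {start}-{end}: {name}")
```

Because this is a pure lookup, nothing outside the dictionary is ever marked, which is why the list only grows when new formulae are found in reviewed articles.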


ChemMantis is now in alpha release and under test. ChemMantis is our Chemistry Markup And Nomenclature Transformation Integrated System. The movie below can likely tell a better story than I can write. So, let’s start with this movie…and more will follow. The premise is: upload a document, find chemical names, convert names/identifiers to chemical structures and find related information. In this case we are demonstrating how structures are linked to information on ChemSpider and from there out to other information on the web. There are more such displays to come…


I’ve posted over at the ChemConnector Blog about the potential need for a neutral review of the performance of Optical Structure Recognition algorithms. I’m interested in the technology because we are now using it on ChemSpider for our document markup and structure recognition. I’d welcome your thoughts and comments…visit the blog post.


There’s no shortage of possibilities regarding where we could go next with ChemSpider and we’re always thinking ahead. At present we are focused on chemistry document markup and the development of ChemMantis. Moving forward we are considering how chemists might want to use ChemSpider. Based on comments from organic chemists over the past few months a lot of chemists are using ChemSpider to source chemicals for purchase for screening and specifically to find starting materials for further reactions.

Recently we added the ChemSynthesis structure collection. That database offers links out to over 45,000 articles regarding reaction synthesis. We are now being encouraged to manage reactions directly on ChemSpider. While we of course have the skills to do so it’s not in our near future. But, what if we did? Then retrosynthetic analysis might be possible. At the ACS meeting in Philadelphia in August I gave a presentation on ARChem Route Designer, a software product marketed by SimBioSys. It was my privilege to give this presentation on behalf of one of the most respected chemists, Peter Johnson, someone who has been at the forefront of tools for synthesis design and structure based drug design. Take a look at the presentation about ARChem…for chemists interested in software tools for Retrosynthetic Analysis it may be of interest…and I wonder whether a platform like this might be of interest to integrate with ChemSpider…what do YOU think????


Jean-Claude Bradley, our collaborator at Drexel University, recently posted on “There are no facts…in science - only measurement embedded within assumptions.” He refers to information on ChemSpider a number of times to make his arguments and I point you to his original post to read.

Some specific sections are quoted “There are properties that have been determined so many times by different researchers and different techniques that we can treat a narrow range of values by consensus as if they were absolute facts. An example would be considering the boiling point of methanol at 1 atm to be 65C within one degree of accuracy. For most purposes that will suffice, as long as we understand the source of our confidence.”

When we deposit property information onto ChemSpider we make attributions with the outlinks. So, if you look at this record for ethyl acetate you will see a lot of property information listed as shown below. Unfortunately the “units” are not always directly available when we gather the data and we need to add the ability to add/edit units soon. However, there IS generally information in the record for at least one of the entries defining the units and the outlinks (shown by the blue arrows) will take the user to the original data source anyway.

  • experimental physchem properties
    • Melting Point: -84

    • Melting Point: -84

    • Melting Point: -84

    • Melting Point: -84

    • Melting Point: -84

    • Melting Point: -84 C

    • Boiling Point: 76-77

    • Boiling Point: 77

    • Boiling Point: 77

    • Boiling Point: 77

    • Boiling Point: 77

    • Boiling Point: 171F

    • Boiling Point: 77º

    • Boiling Point: 77 C

    • Flash Point: -3(26F)

    • Flash Point: -3(26F)

    • Flash Point: -3(26F)

    • Flash Point: -3(26F)

    • Flash Point: -3(26F)

    • Flash Point: 24F

    • Flash Point: -4 C

    • Freezing Point: -117F

    • Specific Gravity: 0.902

    • Specific Gravity: 0.902

    • Specific Gravity: 0.902

    • Specific Gravity: 0.902

    • Specific Gravity: 0.902

    • Specific Gravity: 0.90

    • Specific Gravity: 0.894 - 0.898

    • Refraction Index: 1.3720

    • Refraction Index: 1.3720

    • Refraction Index: 1.3720

    • Refraction Index: 1.3720

    • Refraction Index: 1.3720

    • Refraction Index: 1.371 - 1.376

    • Ionization Potential: 10.01 eV

    • Vapor Pressure: 73 mmHg

  • miscellaneous
    • Appearance: Colorless liquid with an ether-like, fruity odor.

    • Appearance: colourless liquid with fruit-like odour

    • Appearance: Colourless liquid, volatile at low temperatures with a fragrant, acetic, ethereal odour

    • Applications: Pesticide residue, environmental, and GC analysis
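The mix of values with and without units in the list above shows why an add/edit units capability matters. As a rough sketch only, here is one way single temperature values could be normalized to Celsius once a unit is known. The function name, the regular expression and the `default_unit` parameter are assumptions for illustration; nothing here reflects how ChemSpider actually stores or edits units. Values with no unit are deliberately left unresolved rather than guessed, and ranges or compound entries such as “76-77” or “-3(26F)” are not handled.

```python
import re

# One number, an optional degree sign, and an optional C or F unit letter.
_TEMPERATURE = re.compile(r"^\s*(-?\d+(?:\.\d+)?)\s*[º°]?\s*([CF])?\s*$")


def to_celsius(raw, default_unit=None):
    """Convert a single temperature string to degrees Celsius.

    Returns None when the string is not a single value or when no unit is
    present and no default_unit ("C" or "F") has been supplied by a curator.
    """
    match = _TEMPERATURE.match(raw)
    if match is None:
        return None
    value = float(match.group(1))
    unit = match.group(2) or default_unit
    if unit == "C":
        return value
    if unit == "F":
        return (value - 32) * 5 / 9
    return None


# Usage on a few of the boiling point entries listed above.
for entry in ["77 C", "171F", "77", "76-77"]:
    print(repr(entry), "->", to_celsius(entry))
```

Passing `default_unit="C"` models a curator confirming the unit for an entry that arrived without one; the outlink to the original source remains the place to check that assumption.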