Exploring Endangered Species on Wikipedia

by Grant Hinman

 

An image of an African penguin.

Endangered species need our help. These animals are often neglected, forgotten, or killed by humans. This website aims to investigate how these endangered animals are represented on Wikipedia and how this information can aid in the recovery and protection of these animals.

The data for this site comes from the Wikipedia pages of endangered species and their past revisions. For each page, the data includes one snapshot per year, from the current version back to the year the page was created. Together, these snapshots show how each page has changed over time. The Wikipedia page "Endangered species" lists several conservation classifications: extinct, extinct in the wild, critically endangered, endangered, vulnerable, threatened, and least concern. This project analyzes animals classified as endangered. The 48 pages about these animals are ordered alphabetically by English name.
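The essay does not say how the yearly snapshots were collected. As a minimal sketch of one possible approach, the Python below builds MediaWiki API query URLs that ask for the last revision of a page made on or before the end of a given year. The function names and the end-of-year cutoff are assumptions made for illustration, not the project's actual method.

```python
from urllib.parse import urlencode

# Public MediaWiki API endpoint for English Wikipedia.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def yearly_snapshot_url(title: str, year: int) -> str:
    """Build a query URL for the newest revision of `title` at or before the end of `year`.

    Hypothetical helper: the end-of-year cutoff is an assumption, not the
    essay's documented sampling rule.
    """
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvlimit": 1,                          # only the single closest revision
        "rvdir": "older",                      # walk backwards in time from rvstart
        "rvstart": f"{year}-12-31T23:59:59Z",  # start at the very end of the year
        "rvprop": "ids|timestamp",
        "format": "json",
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


def snapshot_urls(title: str, first_year: int, last_year: int) -> dict:
    """Map each year in the inclusive range to its snapshot query URL."""
    return {
        year: yearly_snapshot_url(title, year)
        for year in range(first_year, last_year + 1)
    }
```

For example, `snapshot_urls("Flores crow", 2007, 2010)` would yield four URLs, one per year. Fetching each URL and reading the returned revision ID is left out here, since it depends on network access and on how the project chose to store its data.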

The animals fall into 15 clusters, most of which contain different species of the same kind of animal. For example, there are clusters for cranes, turtles, herons, tigers, wolves, and chimpanzees. Cluster 8 is interesting because it collects four-legged animals that graze on grasslands, including a zebra, a horse, and a buffalo. With fewer clusters, the groupings were not as clear as they were with 15. Adding more clusters did not fix the problem of some clusters acting as catch-all or random groupings. The clusters are helpful for understanding what types of animals are on the endangered species list.

Three animal pages have a painting, rather than a photograph, as the main display image: the Australasian bittern, hispid hare, and white-eared night heron. The Australasian bittern is "a secretive bird" that lives in Australia and New Zealand, according to Wikipedia. Although it is not the display image, there is a photograph of the bird on the page, unlike the pages for the hispid hare and white-eared night heron. Both of these animals are nocturnal and do not have a photograph on their respective Wikipedia pages. One reason for the lack of photographs could be that each of these animals is active when people are not (i.e., at night or at dawn). These animals also live away from civilization. Another reason could be the popularity of the animal. Interestingly, each of these pages is shorter and less descriptive than the other pages. The bittern, hare, and heron pages have only 140, 179, and 177 words, respectively. Pages for better-known animals, such as the Asian elephant, Bengal tiger, and red panda, have between 1000 and 1800 words. The word count offers an interesting explanation for the lack of photographs: people do not know or care much about these animals. Their pages are short and the animals are not documented with photographs. The page data and images allow us to see which endangered species are more neglected than others.


The painting of an Australasian bittern.

The image displayed for the goliath frog is a photograph of a taxidermied frog, which is unlikely to appeal to most readers. The page is also short, at only 248 words. The species faces threats from the pet trade, but the Wikipedia page offers little information about the animal, which limits awareness of the issue.


The display image of the goliath frog.

The word count for each Wikipedia page provides key insight into how much information is available about an animal. The Flores crow, Humblot's heron, and Vietnamese pheasant each have distinctively low word counts. The Flores crow page has only 21 words. The page is just two paragraphs of two sentences each. The first paragraph has not changed since 2007, the year the page was created, and the page has only ever had one section: References/Sources. Similarly, the first paragraph of the Vietnamese pheasant page has changed little since the page's inception in 2004. In the last 6 years, the only update to that paragraph has been the addition of one sentence. This page has only 47 words, which are split between the first paragraph and a paragraph about the animal's habitat. The habitat section was introduced in 2012 and significantly increased the character count of the page. Finally, the Humblot's heron page has only 43 words, but its first paragraph has grown consistently, with the largest increase in 2009. None of these animals is well known, which is reflected in the lack of information on their Wikipedia pages.

On the other hand, the Tasmanian devil page has the most words of any page on the list, with 2382 words. This can be attributed to the fact that the Tasmanian devil has a large cultural presence and is well known internationally. The page has 10 sections covering topics such as the animal's description, habitat, conservation status, and relationship with humans. This is a stark contrast to the pages with low word counts, which describe the animal in only a few sentences. The Tasmanian devil page even includes information about a legal dispute between Warner Bros. and the Tasmanian government over use of the image of Taz, the Looney Tunes character, for tourism marketing. The second-longest page is that of the famous Bengal tiger. When most people think of a tiger, the Bengal tiger is the image that comes to mind. Like the Tasmanian devil page, the Bengal tiger page contains information about the animal's biology, its habitat, conservation efforts, and its cultural impact. It is important to see the relationship between the number of words and general knowledge of the animal. The more popular and well-known endangered species have longer pages with more information, while the animals with shorter pages are also less well known.


An image of a Tasmanian devil.
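The word-count comparison above can be sketched in a few lines of Python. The counts in the table are the ones quoted in this essay, not fresh measurements from Wikipedia. The helper names and the whitespace-based definition of a "word" are assumptions, since the essay does not say how words were counted.

```python
# Word counts as quoted in this essay (not re-measured from Wikipedia).
WORD_COUNTS = {
    "Flores crow": 21,
    "Humblot's heron": 43,
    "Vietnamese pheasant": 47,
    "Australasian bittern": 140,
    "White-eared night heron": 177,
    "Hispid hare": 179,
    "Goliath frog": 248,
    "Tasmanian devil": 2382,
}


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (an assumed definition of 'word')."""
    return len(text.split())


def rank_by_length(counts: dict) -> list:
    """Return (page, word count) pairs ordered from shortest to longest page."""
    return sorted(counts.items(), key=lambda item: item[1])


def length_ratio(counts: dict) -> float:
    """Ratio of the longest page's word count to the shortest page's."""
    return max(counts.values()) / min(counts.values())


if __name__ == "__main__":
    for page, words in rank_by_length(WORD_COUNTS):
        print(f"{page:<25} {words:>5}")
    print(f"Longest page is about {length_ratio(WORD_COUNTS):.0f}x the shortest")
```

By these figures, the Tasmanian devil page is more than a hundred times longer than the Flores crow page, which makes the gap between well-known and lesser-known species concrete.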

Furthermore, the pages of the more popular animals have more internal links and are available in more languages. These pages appear in the upper right corner of the graph on the "Viz" page and include the snow leopard and the red panda. Closer to the origin are lesser-known animals such as the purple-faced langur, Malagasy pond heron, and toque macaque. Each of these pages has around 400-500 words, which is on the lower end of the spectrum for the studied pages. Ultimately, the more popular the animal, the more information is included on its page.

Overall, the pages of endangered species become longer and more informative over time. More users add to, line-edit, and restructure the pages so that they become a more comprehensive resource for the reader. By looking at how these changes occur over time, we can discover how information is structured on Wikipedia and what that means for the animals' pages. We observed that the more well known the animal, the longer its Wikipedia page. This is an important relationship to understand because it could lead to further discrimination against and neglect of animals that are already struggling. Many of the animals on the endangered species list are threatened by poaching or capture for the pet trade. This issue is internationally known in the case of tigers, elephants, and more recently the Indian pangolin. Google had a Valentine's Day game on its home page to spread awareness about the pangolin. Many people had never heard of the animal or of the poaching of it for its scales. After people learned about it, conservation efforts increased and the Indian pangolin is doing better than it was. The lack of information on some of the pages of lesser-known animals is a problem because people cannot learn about the animals or find out how they can help. If Wikipedia pages for animals such as the goliath frog are strengthened with images and more information, perhaps the goliath frog will recover and come off the endangered species list. Endangered species need help, and this website and analysis suggest that we can help them with data. Wikipedia is the internet's "Free Encyclopedia", and, in this case, it can serve as a platform for important progress in conservation efforts.