Python wikipedia module: catch DisambiguationError and pick best option
By Guest on 18th December 2022 08:31:37 AM | Syntax: PYTHON | Views: 207



When a title resolves to a disambiguation page (one that lists several articles sharing a name), wikipedia.page() raises a DisambiguationError. You can catch the exception and use its options attribute, a list of candidate page titles, to decide which page to retrieve.

Here's an example that catches DisambiguationError and lets the user pick a page from the options:

import wikipedia

# Search for pages that match the search term
results = wikipedia.search("search term")
if not results:
    raise SystemExit("No pages matched the search term.")

try:
    # Request the top result by its exact title; auto_suggest=False stops
    # the library from substituting a suggested title for the one given
    page = wikipedia.page(results[0], auto_suggest=False)

except wikipedia.DisambiguationError as e:
    # e.options is the list of titles the disambiguation page links to
    print("Multiple pages were found with the search term:")
    for option in e.options:
        print(option)

    # Prompt the user to select the best page
    selection = input("Please select the best page: ")

    # Get the page for the selected option
    page = wikipedia.page(selection, auto_suggest=False)

# Print the page's title and summary
print(page.title)
print(page.summary)

This code searches for pages that match the search term and requests the top result. If that title is a disambiguation page, the code catches the DisambiguationError, prints the options carried by the exception, asks the user to choose one, then retrieves the chosen page and prints its title and summary.

You can also use the options in the DisambiguationError to select a page automatically instead of prompting the user. For example, a simple heuristic could pick the option with the shortest title, or the option whose title is most similar to the search term.
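
As a rough sketch of such a heuristic, the function below ranks the options by string similarity to the search term and breaks ties in favour of the shorter title. The name pick_best_option and the choice of difflib.SequenceMatcher are assumptions made for illustration; they are not part of the wikipedia module, and a different similarity measure may suit your data better.

```python
import difflib


def pick_best_option(query, options):
    """Return the option most similar to query, or None if none is usable.

    Hypothetical helper: the scoring rule is an illustrative assumption,
    not something the wikipedia module provides.
    """
    # Skip nested disambiguation pages, which would raise the same error again
    candidates = [o for o in options if "(disambiguation)" not in o.lower()]
    if not candidates:
        return None

    def score(option):
        similarity = difflib.SequenceMatcher(
            None, query.lower(), option.lower()
        ).ratio()
        # Higher similarity wins; on a tie, the shorter title wins
        return (similarity, -len(option))

    return max(candidates, key=score)
```

Inside the except block, the prompt could then be replaced by something like page = wikipedia.page(pick_best_option("search term", e.options), auto_suggest=False). The function can return None, and the chosen title may itself fail to load, so both cases still need handling.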


---
Sample error output:

In [18]: get_page("DisambiguationError")
/home/codespace/venv/lib/python3.8/site-packages/wikipedia/wikipedia.py:389: GuessedAtParserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("html.parser"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.

The code that caused this warning is on line 389 of the file /home/codespace/venv/lib/python3.8/site-packages/wikipedia/wikipedia.py. To get rid of this warning, pass the additional argument 'features="html.parser"' to the BeautifulSoup constructor.

  lis = BeautifulSoup(html).find_all('li')
---------------------------------------------------------------------------
DisambiguationError                       Traceback (most recent call last)
Cell In[18], line 1
----> 1 get_page("DisambiguationError")

File /workspaces/devops_test/mylib/wiki.py:14, in get_page(search)
     11 results = wikipedia.search(search)
     13 # Get the best page from the search results
---> 14 page = wikipedia.page(results[0])
     16 # Print the page's title and summary
     17 print(page.title)

File ~/venv/lib/python3.8/site-packages/wikipedia/wikipedia.py:276, in page(title, pageid, auto_suggest, redirect, preload)
    273     except IndexError:
    274       # if there is no suggestion or search results, the page doesn't exist
    275       raise PageError(title)
--> 276   return WikipediaPage(title, redirect=redirect, preload=preload)
    277 elif pageid is not None:
    278   return WikipediaPage(pageid=pageid, preload=preload)

File ~/venv/lib/python3.8/site-packages/wikipedia/wikipedia.py:299, in WikipediaPage.__init__(self, title, pageid, redirect, preload, original_title)
    296 else:
    297   raise ValueError("Either a title or a pageid must be specified")
--> 299 self.__load(redirect=redirect, preload=preload)
    301 if preload:
    302   for prop in ('content', 'summary', 'images', 'references', 'links', 'sections'):

File ~/venv/lib/python3.8/site-packages/wikipedia/wikipedia.py:393, in WikipediaPage.__load(self, redirect, preload)
    390   filtered_lis = [li for li in lis if not 'tocsection' in ''.join(li.get('class', []))]
    391   may_refer_to = [li.a.get_text() for li in filtered_lis if li.a]
--> 393   raise DisambiguationError(getattr(self, 'title', page['title']), may_refer_to)
    395 else:
    396   self.pageid = pageid

DisambiguationError: "Error (disambiguation)" may refer to:
Error (band)
Error (Error EP)
Errors (band)
"Error" (song)
Error (VIXX EP)
Error (The Warning album)
Error (Lee Chan-hyuk album)
Ohms
Susumu Hirasawa
Error (Hong Kong group)
Error (baseball)
Error (law)
Error message
Error (linguistics)
Errors and residuals
Error term
Err (disambiguation)
I am Error
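
The GuessedAtParserWarning at the top of this output comes from the library's own call to BeautifulSoup, so the suggested fix (passing features="html.parser") cannot be applied from calling code without editing the installed package. One possible workaround, sketched here under that assumption, is to filter the warning by its message text with the standard warnings module. The helper name silence_parser_warning is hypothetical.

```python
import warnings


def silence_parser_warning():
    """Ignore the 'No parser was explicitly specified' warning (hypothetical helper).

    The filter matches on the start of the warning message, so other
    warnings are still shown.
    """
    warnings.filterwarnings(
        "ignore",
        message="No parser was explicitly specified",
    )
```

Calling silence_parser_warning() once before the first wikipedia.page() call should hide that warning for the rest of the process. It does not affect the DisambiguationError itself, which still has to be caught.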