From 3c68db23bfc7cba86567cef736cbf3b824cd197f Mon Sep 17 00:00:00 2001
From: Jidong Xiao
Date: Thu, 26 Oct 2023 21:51:42 -0400
Subject: [PATCH] explaining the description and the title

---
 hws/07_search_engine/README.md | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/hws/07_search_engine/README.md b/hws/07_search_engine/README.md
index 655eb47..8a5f6fc 100644
--- a/hws/07_search_engine/README.md
+++ b/hws/07_search_engine/README.md
@@ -132,7 +132,7 @@ When searching *Tom Cruise*, your search engine should not include a page which
 
 Search Engines like Google will search all types of files on the Internet, but in this assignment, we assume all files we search are HTML files. And we consider an HTML file contains the search query only if the search query can be found within the <body> section of the HTML file. The <body> section, enclosed within the <body></body> tags in an HTML document, represents the primary content area of the web page.
 
-Based on Rule 1 and Rule 2: when the search query is *Tom Cruise*, the third page showed in this image should not be included in your search results, unless the words *Tom Cruise* appears in the other part of the <body></body> section of this web page, which is not displayed here.
+Based on Rule 1 and Rule 2: when the search query is *Tom Cruise*, the second page shown in this image should not be included in your search results, unless the words *Tom Cruise* appear in another part of the <body></body> section of this web page that is not displayed here.
 
 ![alt text](images/tom_cruise.png "tom cruise")
 
@@ -142,6 +142,12 @@ But wait, we see *Tom Cruise* here:
 
 That's true, but this line is not in the <body> section of the HTML file, it is created via a meta description tag which is in the <head> section of the HTML file. We will have more details on this in [a later section](#the-description) in this README.
 
+The same applies to this line:
+
+![alt text](images/tom_cruise_title.png "tom cruise title")
+
+This line is not in the <body> section of the HTML file either; it is created via a title tag, which is in the <head> section of the HTML file. More details are given in [a later section](#the-title) of this README.
+
 ### Rule 3. Search Query: No More Than 3 Words
 
 We also limit the user to search no more than 3 words in each query. Based on this rule, we allow users to search *Tom*, *Tom Cruise*, *Tom and Jerry*, but *Tom Hanks Academy Award* is not allowed, as it contains more than 3 words.
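
To illustrate the body-only rule and the three-word limit described in this patch, here is a minimal Python sketch. It is not the assignment's reference implementation: the helper names (`extract_body`, `body_contains`, `is_valid_query`) and the sample page are hypothetical, and it assumes a plain case-sensitive substring match, which may be looser than what Rule 1 of the README actually requires.

```python
import re

# Hypothetical sample page: "Tom Cruise" appears only inside <head>
# (in the title tag and the meta description tag), never inside <body>.
PAGE = """<html>
<head>
<title>Tom Cruise - Wikipedia</title>
<meta name="description" content="Tom Cruise is an American actor.">
</head>
<body>
<p>An article about an American actor and producer.</p>
</body>
</html>"""


def extract_body(html: str) -> str:
    """Return the text between <body ...> and </body>, or "" if no such pair exists."""
    match = re.search(r"<body\b[^>]*>(.*?)</body\s*>", html, re.IGNORECASE | re.DOTALL)
    return match.group(1) if match else ""


def is_valid_query(query: str, max_words: int = 3) -> bool:
    """Rule 3: a query must contain between 1 and max_words words."""
    return 0 < len(query.split()) <= max_words


def body_contains(html: str, query: str) -> bool:
    """Rule 2: only text inside the <body> section counts as a match.

    Assumption: a simple case-sensitive substring test stands in for the
    matching rule; the README's Rule 1 may impose further constraints.
    """
    return query in extract_body(html)


print(body_contains(PAGE, "Tom Cruise"))      # title/meta text is ignored
print(body_contains(PAGE, "American actor"))  # found inside <body>
print(is_valid_query("Tom Hanks Academy Award"))
```

With this sample page, the query *Tom Cruise* is rejected even though it is visible in the title and the description, because both of those come from the <head> section.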