Type Here to Get Search Results !

Java Regex 2 - Duplicate | HackerRank Solution By CodingHumans |

0

Java Regex 2 - Duplicate Words

Hint

Get the sentence.
Form a regular expression to remove duplicate words from sentences.

regex = "\\b(\\w+)(?:\\W+\\1\\b)+";

The details of the above regular expression can be understood as:

“\\b”: A word boundary. Boundaries are needed for special cases. For example, in “My thesis is great”, “is” wont be matched twice.
“\\w+” A word character: [a-zA-Z_0-9]
“\\W+”: A non-word character: [^\w]
“\\1”: Matches whatever was matched in the 1st group of parentheses, which in this case is the (\w+)
“+”: Match whatever it’s placed after 1 or more times
Match the sentence with the Regex. In Java, this can be done using Pattern.matcher().
return the modified sentence.

Task 

In this challenge, we use regular expressions (RegEx) to remove instances of words that are repeated more than once, but retain the first occurrence of any case-insensitive repeated word. For example, the words love and to are repeated in the sentence I love Love to To tO code. Can you complete the code in the editor so it will turn I love Love to To tO code into I love to code?

To solve this challenge, complete the following three lines:

1. Write a RegEx that will match any repeated word.
2. Complete the second compile argument so that the compiled RegEx is case-insensitive.
3. Write the two necessary arguments for replaceAll such that each repeated word is replaced with the very first instance the word found in the sentence. It must be the exact first occurrence of the        word, as the expected output is case-sensitive.


Note: This challenge uses a custom checker; you will fail the challenge if you modify anything other than the three locations that the comments direct you to complete. To restore the editor's original stub code, create a new buffer by clicking on the branch icon in the top left of the editor.

Input Style

The following input is handled for you the given stub code:

The first line contains an integer, , denoting the number of sentences.
Each of the  subsequent lines contains a single sentence consisting of English alphabetic letters and whitespace characters.

Constraints

Each sentence consists of at most 10^4  English alphabetic letters and whitespaces.
1<=n<=100

Output Style

Stub code in the editor prints the sentence modified by the replaceAll line to stdout. The modified string must be a modified version of the initial sentence where all repeat occurrences of each word are removed.

Sample Input

5
Goodbye bye bye world world world
Sam went went to to to his business
Reya is is the the best player in eye eye game
in inthe
Hello hello Ab aB

Sample Output

Goodbye bye world
Sam went to his business
Reya is the best player in eye game
in inthe
Hello Ab

Explanation

1. We remove the second occurrence of bye and the second and third occurrences of world from Goodbye bye bye world    world world to get Goodbye bye world.
2. We remove the second occurrence of went and the second and third occurrences of to from Sam went went to to to his    business to get Sam went to his business.
3. We remove the second occurrence of is, the second occurrence of the, and the second occurrence of eye from Reya is is    the the best player in eye eye game to get Reya is the best player in eye          game.
5.The sentence in inthe has no repeated words, so we do not modify it.
6. We remove the second occurrence of ab from Hello hello Ab aB to get Hello Ab. It's important note that our matching is case-insensitive, and we specifically retained the first occurrence of the        matched word in our final string.




Recommended: Please try your approach on your integrated development environment (IDE) first, before moving on to the solution.

Few words from CodingHumans : Don't Just copy paste the solution, try to analyze the problem and solve it without looking by taking the the solution as a hint or a reference . Your understanding of the solution matters.

HAVE A GOOD DAY 😁



Solution
( Java)

import java.util.Scanner;

import java.util.regex.Matcher;

import java.util.regex.Pattern;


public class DuplicateWords {

    public static void main(String[] args) {
        String regex = "\\b(\\w+)(?:\\W+\\1\\b)+";
        Pattern p = Pattern.compile(regex,Pattern.CASE_INSENSITIVE);
        Scanner in = new Scanner(System.in);
        int numSentences = Integer.parseInt(in.nextLine());
        
        while (numSentences-- > 0) {
            String input = in.nextLine();
            
            Matcher m = p.matcher(input);
            
            // Check for subsequences of input that match the compiled pattern
            while (m.find()) {
                input = input.replaceAll(m.group(), m.group(1));
            }
            
            // Prints the modified sentence.
            System.out.println(input);
        }
        
        in.close();
    }
}    




If you have any doubts regarding this problem or  need the solution in other programming languages then leave a comment down below .

Post a Comment

0 Comments

Top Post Ad

Below Post Ad