Text summarizer outputs the second half of the example text rather than a two-sentence summary plus keywords. Where does the summary fail? (Python)


App.py runs and displays the spinner. However, it prints the second half of the example wiki text file instead of the short summary that summarize_text in model.py is supposed to generate. App.py and model.py are two separate files. The intention was to build a simple language model that uses word statistics to produce the summary. Where is the error?
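
To narrow down where the output goes wrong, it may help to check what the sentence-splitting step returns for the input file. The sketch below reuses the splitting pattern from model.py; the helper name `inspect_sentences` and the sample string are hypothetical and only for illustration. If the count is very small relative to the file length, the two "best sentences" would cover a large part of the text, which could look like half of the file being printed. This is one possible cause, not a confirmed diagnosis, since utils.py is not shown.

```python
import re

# Same sentence-splitting pattern as in model.py.
SENTENCE_SPLIT = re.compile(r'(?<=[.!?])\s+')

def inspect_sentences(text):
    """Return (sentence_count, lengths) for the split used by summarize_text."""
    sentences = [s.strip() for s in SENTENCE_SPLIT.split(text) if s.strip()]
    return len(sentences), [len(s) for s in sentences]

# Hypothetical sample: the citation marker "[1]" sits between the period and
# the whitespace, so the lookbehind never matches and nothing is split.
sample = "Python is a language.[1] It is widely used.[2] It was created in 1991"
count, lengths = inspect_sentences(sample)
print(count, lengths)
```

Running the same helper on the text returned by load_text would show whether the wiki file is being split into many short sentences or only a few very long ones.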

# -------------------- App.py --------------------

import argparse  # Command-line argument parsing.
import sys  # Exit codes and stdout access.
import os  # File path checks.
import time  # Delay between spinner frames.

# Import the summarization and file-loading functions from the sibling modules.
# Relative imports like these only resolve when App.py is run as part of a package.
from .model import summarize_text
from .utils import load_text

# Spinner animation for longer operations, to improve the user experience (UX).
def spinner():
  # Cycle through the spinner frames indefinitely.
  while True:
    for frame in "|/-\\":
      yield frame

# Main CLI entry point.
def main():
  # Initialize the argument parser with a description.
  parser = argparse.ArgumentParser(
    description="AI-powered text summarizer CLI tool"
  )
  # Positional argument: the text file to summarize.
  parser.add_argument(
    "input_file",
    type=str,
    help="Path to the text file you want to summarize"
  )
  # Optional argument: file to write the summary to.
  parser.add_argument(
    "-o", "--output",
    type=str,
    help="Optional output file to save the summary"
  )
  # Verbose flag: log each step of the process (useful for teaching and auditing).
  parser.add_argument(
    "-v", "--verbose",
    action="store_true",
    help="Enable verbose mode for detailed logs"
  )

  # Parse the command-line arguments.
  args = parser.parse_args()

  # Check that the input file exists.
  if not os.path.exists(args.input_file):
    print(f"Error: File not found: {args.input_file}")
    sys.exit(1)

  # Load and read text file.
  if args.verbose:
    print(f"[INFO] Loading file: {args.input_file}")
  text = load_text(args.input_file)
  
  # Prevent empty file summarization.
  if not text.strip():
    print("Error: File is empty. Cannot summarize an empty text file.")
    sys.exit(1)

  # Announce that summarization is starting.
  if args.verbose:
    print("Generating summary...")
  
  # Display spinner (it runs to completion before summarization starts).
  spin = spinner()
  for _ in range(10):
    sys.stdout.write(next(spin))
    sys.stdout.flush()
    time.sleep(0.05)
    sys.stdout.write("\b")

  # Run the summarization.
  summary = summarize_text(text)
  
  # Print summary to console.
  print("\n=== SUMMARY RESULT ===\n")
  print(summary)

  # Save output to file if requested.
  if args.output:
    with open(args.output, "w", encoding="utf-8") as f:
      f.write(summary)

    print(f"\n Summary saved to: {args.output}")


# Direct execution only.
if __name__ == "__main__":
    main()
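
A side note on the spinner: in main() above, the ten frames are drawn before summarize_text is called, so the animation does not track the actual work. A minimal sketch of running it in a background thread while a function executes is shown here; `run_with_spinner` and its `interval` and `stream` parameters are hypothetical names, not part of the original script.

```python
import itertools
import sys
import threading
import time

def run_with_spinner(func, *args, interval=0.05, stream=sys.stdout):
    """Call func(*args) while a spinner animates in a background thread."""
    done = threading.Event()

    def spin():
        # Draw frames until the main thread signals that the work is finished.
        for frame in itertools.cycle("|/-\\"):
            if done.is_set():
                break
            stream.write(frame)
            stream.flush()
            time.sleep(interval)
            stream.write("\b")
        stream.write(" \b")  # Erase the last frame.
        stream.flush()

    worker = threading.Thread(target=spin, daemon=True)
    worker.start()
    try:
        return func(*args)
    finally:
        # Stop the spinner even if func raises.
        done.set()
        worker.join()
```

Under these assumptions, the call in main() would become `summary = run_with_spinner(summarize_text, text)`. This only affects the display and is unrelated to the wrong summary.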

# ------------------- model.py -------------------

import re
from collections import Counter

def summarize_text(text):
    """
    Lightweight extractive summarizer:
    - Extracts sentences
    - Computes keyword frequency
    - Selects best 1–2 sentences
    - Produces a clean, grammatical summary
    """

    # --- 1. Split text into sentences ---
    sentences = [s.strip() for s in re.split(r'(?<=[.!?])\s+', text) if s.strip()]

    if not sentences:
        return "No usable sentences found in the text."

    # --- 2. Count word frequencies ---
    words = re.findall(r'\b\w+\b', text.lower())
    word_counts = Counter(words)

    stopwords = {
        "the", "and", "of", "in", "to", "a", "is", "for", "on", "with",
        "as", "by", "an", "at", "that", "this", "it", "from", "be", "are"
    }

    keywords = [w for w, _ in word_counts.most_common(20) if w not in stopwords]

    if not keywords:
        keywords = list(word_counts.keys())[:10]

    # --- 3. Score each sentence ---
    def sentence_score(sentence):
        s_words = re.findall(r'\b\w+\b', sentence.lower())
        return sum(word_counts.get(w, 0) for w in s_words if w in keywords)

    ranked = sorted(sentences, key=sentence_score, reverse=True)

    # --- 4. Take top 1–2 sentences ---
    best_sentences = ranked[:2]
    summary = " ".join(best_sentences)

    # --- 5. Fix spacing, ensure full sentences ---
    summary = summary.replace("  ", " ").strip()
    if not summary.endswith((".", "!", "?")):
        summary += "."

    # --- 6. Include missing core keywords in English form ---
    top_keywords = keywords[:5]
    missing = [k for k in top_keywords if k not in summary.lower()]

    if missing:
        if len(missing) == 1:
            summary += f" Key concept emphasized: {missing[0]}."
        else:
            summary += (
                " Key concepts emphasized: "
                + ", ".join(missing[:-1])
                + f", and {missing[-1]}."
            )

    return summary
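
One property of step 3 above is that it sums word frequencies, so longer sentences tend to score higher simply because they contain more words. The sketch below is an alternative under stated assumptions, not the original method: it divides each score by the sentence's word count and returns the chosen sentences in their original order. The name `top_sentences`, the `limit` parameter, and the shortened stopword set are hypothetical.

```python
import re
from collections import Counter

STOPWORDS = {"the", "and", "of", "in", "to", "a", "is", "for", "on", "with"}

def top_sentences(text, limit=2):
    """Pick `limit` sentences by average keyword frequency, in original order."""
    sentences = [s.strip() for s in re.split(r'(?<=[.!?])\s+', text) if s.strip()]
    # Count only non-stopwords; a Counter returns 0 for anything it has not seen.
    counts = Counter(w for w in re.findall(r'\b\w+\b', text.lower())
                     if w not in STOPWORDS)

    def score(sentence):
        words = re.findall(r'\b\w+\b', sentence.lower())
        if not words:
            return 0.0
        # Dividing by the word count keeps long sentences from winning on length alone.
        return sum(counts[w] for w in words) / len(words)

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    # Restore document order so the summary reads naturally.
    return [sentences[i] for i in sorted(ranked[:limit])]
```

If the splitter finds only a handful of very long sentences, normalizing the score will not help much on its own, so it is worth checking the sentence count first.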

Thank you for your answers.

asked 21 hours ago

metaknews
