My project is a news feed application. I can successfully save data fetched from the API to Firebase. However, duplicate news items end up saved in my database.

How can I prevent this? My code is below:

// fetch news api
getData = (req, res) => {
    return request(newsURL)
        .then(news => news.articles)
        .then(infos => {
            infos.map((info) => {
                const newsDoc = {
                    author: info.author,
                    content: info.content,
                    description: info.description,
                    publishedAt: info.publishedAt,
                    source: info.source.name,
                    title: info.title,
                    url: info.url,
                    urlToImage: info.urlToImage,
                    likeCount: 0,
                    commentCount: 0
                };
                db.collection("feeds").add(newsDoc)
                    .then((doc) => {
                        const finalNews = newsDoc;
                        finalNews.feedsId = doc.id;
                        // res.json(finalNews)
                    })
                    .catch(err => console.error(err));
            });
        });
};

I think a unique identifier of all the news data is the URL.

The question in this post, Skip Duplicates in Firebase Database, is related to mine, but I can't seem to get it to work.

How can I save unique news data without duplicates?

Here is an example of the data in the Firebase database:

// original
{
    "feedsId": "Cjy5lW6g6StTVj1GjpQk",
    "author": "El Confidencial",
    "content": "Germán Loera, un conocido 'youtuber' en Mçexico de 25 años, ha sido condenado a 50 años de prisión por un delito de secuestro cometido en Chihuaha en 2018. Los secuestrados reclamaron la cantidad de dinero del rescate en bitcoin. Según ha informado la Fiscalí… [+1153 chars]",
    "description": "Germán Loera, de 25 años, ha sido condenado a 50 años de prisión junto a otros cinco hombres por secuestrar en 2018 a una abogada",
    "publishedAt": "2020-03-04T17:45:00Z",
    "title": "A prisión un 'youtuber' mexicano por un secuestro y pedir el rescate en bitcoins",
    "url": "https://www.elconfidencial.com/amp/mundo/2020-03-04/prision-youtube-mexico-secuestro-bitcoin_2482519/",
    "urlToImage": "https://www.ecestaticos.com/imagestatic/clipping/5c9/56e/5c956e35512404b3022b56dbe0b80c3b/a-prision-un-youtuber-mexicano-por-un-secuestro-y-pedir-el-rescate-en-bitcoins.jpg?mtime=1583343920",
    "likeCount": 0,
    "commentCount": 0
},

// duplicate
{
    "feedsId": "MqqFnixBwQmA0CoBzwr2",
    "author": "El Confidencial",
    "content": "Germán Loera, un conocido 'youtuber' en Mçexico de 25 años, ha sido condenado a 50 años de prisión por un delito de secuestro cometido en Chihuaha en 2018. Los secuestrados reclamaron la cantidad de dinero del rescate en bitcoin. Según ha informado la Fiscalí… [+1153 chars]",
    "description": "Germán Loera, de 25 años, ha sido condenado a 50 años de prisión junto a otros cinco hombres por secuestrar en 2018 a una abogada",
    "publishedAt": "2020-03-04T17:45:00Z",
    "title": "A prisión un 'youtuber' mexicano por un secuestro y pedir el rescate en bitcoins",
    "url": "https://www.elconfidencial.com/amp/mundo/2020-03-04/prision-youtube-mexico-secuestro-bitcoin_2482519/",
    "urlToImage": "https://www.ecestaticos.com/imagestatic/clipping/5c9/56e/5c956e35512404b3022b56dbe0b80c3b/a-prision-un-youtuber-mexicano-por-un-secuestro-y-pedir-el-rescate-en-bitcoins.jpg?mtime=1583343920",
    "likeCount": 0,
    "commentCount": 0
},

1 Answer

The URL is indeed a unique identifier that you could use.

Your code adds every article unconditionally. Try checking whether a document with the same URL already exists before adding, as in the example below:

// Only add the article if no stored document has the same URL.
const feedsRef = db.collection('feeds');
feedsRef.where('url', '==', newsDoc.url).get()
    .then(snapshot => {
        if (!snapshot.empty) {
            return; // an article with this URL is already stored
        }
        console.log('No matching documents.');
        return feedsRef.add(newsDoc).then((doc) => {
            const finalNews = newsDoc;
            finalNews.feedsId = doc.id;
        });
    })
    .catch(err => {
        console.error('Error getting or saving documents', err);
    });
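
Note that the check above is not atomic: if the same article is processed twice at nearly the same time, both queries can come back empty and both adds succeed. An alternative that may be more robust is to derive the document ID from the URL, so the same article always maps to the same document. The following is a minimal sketch, assuming the Firebase Admin SDK for Node.js, where `DocumentReference.create()` rejects if the document already exists. The helper names `feedIdFromUrl` and `saveUniqueFeeds` are hypothetical, and the check on error code 6 (gRPC ALREADY_EXISTS) is an assumption you should verify against your SDK version.

```javascript
const crypto = require("crypto");

// Firestore document IDs cannot contain "/", so hash the URL into a
// safe, stable ID. The same URL always yields the same ID.
function feedIdFromUrl(url) {
  return crypto.createHash("sha256").update(url).digest("hex");
}

// Assumed gRPC status code for ALREADY_EXISTS, raised by create()
// when a document with that ID is already stored.
const ALREADY_EXISTS = 6;

// Hypothetical helper: `db` is an initialized Firestore instance,
// `articles` is the array returned by the news API.
async function saveUniqueFeeds(db, articles) {
  const results = await Promise.all(
    articles.map(async (info) => {
      const feedsId = feedIdFromUrl(info.url);
      const newsDoc = {
        author: info.author,
        content: info.content,
        description: info.description,
        publishedAt: info.publishedAt,
        source: info.source.name,
        title: info.title,
        url: info.url,
        urlToImage: info.urlToImage,
        likeCount: 0,
        commentCount: 0,
      };
      try {
        // create() fails instead of overwriting, so existing
        // likeCount and commentCount values are left untouched.
        await db.collection("feeds").doc(feedsId).create(newsDoc);
        return { ...newsDoc, feedsId };
      } catch (err) {
        if (err.code === ALREADY_EXISTS) return null; // duplicate, skip
        throw err;
      }
    })
  );
  // Return only the articles that were newly written.
  return results.filter((doc) => doc !== null);
}
```

In the handler from the question, this could be called as `request(newsURL).then(news => saveUniqueFeeds(db, news.articles))`. Using `set()` with the same ID would also prevent duplicates, but it would overwrite the stored counters on every fetch, which is why the sketch uses `create()`.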

8 Comments

hi, it did not work, it is still saving duplicate items
Are the URLs of these new duplicates unique? could you share a sample?
i have added the response from db
you are using node.js, correct? change the doc.exists for doc.empty, I will update the code from the answer so you can copy it
not working.. see error message: TypeError: db.collection(...).update is not a function