Data Science

Great Reads

by Shakhansho
An overview of the newly written package anonympy and a walk-through of some of its methods and functionality
by Oliver Kohl D.Sc.
Step-by-step guide on data-cleaning
by Glenn Viroux
How to use multi indexing in pandas, with practical use cases such as monitoring changes in the earth surface temperature.

All Articles

8 Nov 2023 by Gerry Schmitz
To interpolate, you need the "relative position" of the missing point, relative to the "two" points on "either side" of the missing value. In other words, you can't interpolate unless you have at least 2 points to start with, and you know...
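The "relative position" idea can be sketched in a few lines of Python. This is only a minimal illustration of plain linear interpolation between two known neighbours, not necessarily the method the answer goes on to describe; the function name and the sample values are made up for the example.

```python
def interpolate_linear(x0, y0, x1, y1, x):
    """Estimate y at x from the two known points on either side of it."""
    if x0 == x1:
        raise ValueError("the two known points must have different x positions")
    # Relative position of x between x0 and x1: 0.0 at x0, 1.0 at x1.
    t = (x - x0) / (x1 - x0)
    return y0 + t * (y1 - y0)


# Hypothetical data: a missing value at position 2, with known values
# at positions 1 and 3.
estimate = interpolate_linear(1, 10.0, 3, 20.0, 2)
print(estimate)
```

With only one known point there is no second neighbour to measure the relative position against, which is why at least two points are needed.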
12 Sep 2021 by CPallini
Quote: I'm wondering where is the object of the (*age) member in the copy constructor body? Quote: I've tried to debug and see locals to see what's going behind I see a pointer variable called this is being created when the constructor is called...
20 May 2021 by Jarek Szczegielniak
There are many different types of algorithms in machine learning. Among others, you have supervised, unsupervised and reinforcement learning tasks. Each of these types has different input and output requirements. Classification and regression...
9 Feb 2022 by Shakhansho
An overview of the newly written package anonympy and a walk-through of some of its methods and functionality
9 Dec 2023 by Kenneth Haugland
First off, there is no code to give feedback to here, so the answer is not very good in that respect. In mathematics, there are different ways of interpolating points. There is Lagrange interpolation, or Newton interpolation. These are...
8 Jul 2024 by Dave Kreskowiak
The problem is you're subtracting the joined date from the current date and seeing if it is EQUAL to 9. That will only ever match a single date. In your example, that would be everyone who joined only on 6/21, or (6/30 - 6/21 = 9). If you're...
8 Jul 2024 by Wendelius
Perhaps something like SELECT * FROM users WHERE MOD(DATEDIFF(CURDATE(), joineddate), 9) = 0
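The difference between testing for "equal to 9" and testing for "a multiple of 9" can be checked outside the database. The Python sketch below mirrors the two conditions with the standard datetime module; the year 2024 and the extra user3 row are assumptions added for illustration, since the question only gives 6/30, 6/21 and 6/20.

```python
from datetime import date

# Sample data: user1 and user2 come from the question, user3 is hypothetical.
today = date(2024, 6, 30)  # the year is an assumption
users = {
    "user1": date(2024, 6, 21),  # joined 9 days ago
    "user2": date(2024, 6, 20),  # joined 10 days ago
    "user3": date(2024, 6, 12),  # joined 18 days ago
}

# Equality test: only matches users who joined exactly 9 days ago.
equal_to_nine = [
    name for name, joined in users.items() if (today - joined).days == 9
]

# Modulo test, like MOD(DATEDIFF(...), 9) = 0: matches every multiple of 9.
# Note that this also matches a difference of 0 days (someone who joined today).
every_nine_days = [
    name for name, joined in users.items() if (today - joined).days % 9 == 0
]

print(equal_to_nine)
print(every_nine_days)
```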
12 Sep 2021 by BernardIE5317
It seems to me there is no need to utilize a pointer to store age, as shown below. If you wish to utilize a pointer, you should delete it before copying to it in the copy constructor, else you will have memory leakage. As to your question, if I...
28 Dec 2021 by Dave Kreskowiak
Coming up with the "step-by-step" is your job. That's the point of the exercise. You have to think about the problem and come up with the algorithm that solves it. Do it on paper, one step at a time, and document precisely what you're doing at...
29 Dec 2021 by Maciej Los
Have you seen this: c# - Roman numerals to integers - Stack Overflow[^]?
18 Jan 2022 by OriginalGriff
While we are more than willing to help those that are stuck, that doesn't mean that we are here to do it all for you! We can't do all the work, you are either getting paid for this, or it's part of your grades and it wouldn't be at all fair for...
8 Mar 2022 by OriginalGriff
This is the same thing you asked yesterday: How do I solve these questions[^] And I see no difference in "What I have tried" - so the answer is still the same. Where is problem 1 solution? What testing did you do to check your results so far? ...
18 Mar 2022 by Dave Kreskowiak
There's no such thing as "best scraping code". Sites change their formats all the time and the code you use for one site won't work for the next one.
6 May 2022 by Richard MacCutchan
Line 24 has an unneeded semi-colon after the close brace. And there should be another close brace on a line after line 98.
6 May 2022 by OriginalGriff
To add to what Richard has rightly said, syntax errors like these are a fact of life, and we all meet them on a daily basis - often several per minute if you aren't careful. So it's worth investing a couple of minutes in learning how to spot and...
24 Jun 2022 by OriginalGriff
SELECT ID, ValueColumn, ValueColumn - ValueColumn * 40 / 100 As [40% Off] FROM MyTable
13 Oct 2022 by Sandeep Mewara
Did you try something like below. Filtering: df[df.name.str[-1].apply(lambda x: x in ['a', 'e', 'i', 'o', 'u'])] Refer: Lambda and filter in Python Examples[^] Saving to CSV: df.to_csv('new-location\\output_filtered_sample1.csv', index=False,...
18 Dec 2023 by Oliver Kohl D.Sc.
Step-by-step guide on data-cleaning
5 Apr 2021 by Eddie Winch
Hi there, I am using the following line of code in Pandas: diff = df6.loc[~df6['Venue'].isin(df1['Venue'])] diff And I am not getting the DataFrame output I want. I want to have the DataFrame rows showing, where any rows in the...
20 May 2021 by sumit mandal10
Below is the example for classification, in which we take both X = dataset.iloc[:,[2,3]].values y = dataset.iloc[:,4].values whereas in clustering we are only taking X = dataset.iloc[:,[3,4]].values Full code for reference purposes #...
10 May 2021 by Member 15192276
Use just the recycled_waste feature and the energy_saved in the energy dataset to get the energy saved for each waste type and the total sum.
19 May 2021 by SeanChupas
This is a massive project so I'm not sure what you are expecting us to tell you. It sounds like you need to design this out better, create a database to store your data, and then start writing code to do what you want.
19 May 2021 by CHill60
Quote: Understood, I just feel so helpless with my situation. I've sat for hours and I just can't figure on how to approach this. Let's try to get you started. This article How to Write Code to Solve a Problem, A Beginner's Guide[^] is a good...
30 Jul 2021 by Member 15307080
So for the first data set (2015, 2016, 2017) you just have to divide recycle_rate/energy_saved for the particular waste type. For the second data set (2018, 2019) you have to calculate the recycle_rate by doing...
2 Aug 2021 by nikita agarwal Jun2021
I am trying to make a 3D plot of a galaxy catalog and have a large amount of x,y,z coordinates and data value (w4) stored in separate hdf5 files. Since the data content is huge, I have tried binning them. The output is however taking forever to...
2 Aug 2021 by Richard MacCutchan
You have already posted this question at Creating bins of 3D points (large dataset) with Python, taking long time to load[^]. Please do not repost.
2 Aug 2021 by Dave Kreskowiak
Well, there are a couple of bottlenecks. First, Python is an interpreted language so it's going to be slower than compiled languages. Second, try and read 5.5GB of data and do NOTHING with it. Just read the files and throw the content you read away....
12 Sep 2021 by walied sharkawy
I've written code in which I'm passing a class as a parameter to a function; the copy constructor for the class is written too, in the body of the code. #include #include using namespace std; class Move { private: int* age;...
9 Nov 2021 by Ashutosh sharma Sep2021
I want the maximum number of transactions by each customer to a single merchant, so I grouped the table by cid. After that I take the number of times a transaction is done with value_counts(), which gives the repetition count, but how can I get only...
14 Oct 2021 by Member 15391500
Hey, I have a problem with OrdinalEncoder. I'm using OrdinalEncoder to encode this dataset: Loan Prediction - Analytics Vidhya | Kaggle[^] , but there are 2 features (Dependants and Education) that are -1 for all 614 rows. Can you please tell me...
27 Dec 2021 by OriginalGriff
While we are more than willing to help those that are stuck, that doesn't mean that we are here to do it all for you! We can't do all the work, you are either getting paid for this, or it's part of your grades and it wouldn't be at all fair for...
15 Feb 2022 by OriginalGriff
If you are going to use a database, use it properly - in a format where that data can be stored a single time and queried within that DB. Storing Excel files in your DB does not pass the second criterion and very often fails the first as well. ...
15 Feb 2022 by Maciej Los
In addition to solution #1 by OriginalGriff, I'd suggest reading these: Using SqlDependency for data change events[^] Query Notification using SqlDependency and SqlCacheDependency[^]
19 Mar 2022 by RickZeeland
Maybe one of these web-scraping-services[^]
29 Mar 2022 by Member 15564814
It's a little bit complicated. I have this dataframe: ID TimeandDate Date Time 10 2020-08-07 07:40:09 2022-08-07 07:40:09 10 2020-08-07 08:50:00 2022-08-07 08:50:00 10 2020-08-07 12:40:09 2022-08-07 ...
19 Apr 2022 by darshanbs-wq
I have a dataset of COVID total cases, total deaths and country names. I want to add all details of total cases, deaths and country name to a world map using the folium library and using popmarker, but it is showing "positional argument follows keyword...
19 Apr 2022 by Richard MacCutchan
You have positional parameters after a keyword parameter which is invalid: folium.Marker( location=[data.iloc[i]['Latitude'], data.iloc[i]['Longitude']], popup=data.iloc[i]['Total_Case'], +' ' + data.iloc[i]['Total_Death'] +' ' +...
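The cause of this error can be reproduced without folium. In the Python sketch below, make_marker is a hypothetical stand-in for a marker constructor and the row values are made up; it only illustrates why a stray comma before the concatenation turns the rest of the expression into a positional argument, and one way to avoid it by building the popup text first.

```python
def make_marker(location, popup):
    """Hypothetical stand-in for a marker constructor (not the folium API)."""
    return {"location": location, "popup": popup}


# Made-up sample row; the column names follow the snippet above.
row = {"Latitude": 1.35, "Longitude": 103.82, "Total_Case": 100, "Total_Death": 2}

# A comma right after popup=... ends that keyword argument, so the rest
# (+ ' ' + ...) is parsed as a new positional argument, which is invalid.
broken_call = "make_marker(location=[0, 0], popup='a', + ' ' + 'b')"
try:
    compile(broken_call, "<example>", "eval")
    broken_is_valid = True
except SyntaxError:
    broken_is_valid = False

# Building the full popup string first leaves a single keyword argument.
popup_text = str(row["Total_Case"]) + " " + str(row["Total_Death"])
marker = make_marker(location=[row["Latitude"], row["Longitude"]], popup=popup_text)

print(broken_is_valid)
print(marker["popup"])
```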
19 Aug 2022 by Richard MacCutchan
You need to get the actual structure of the image and find all the cells that contain the colour you want to change. And remember that there are many shades of blue. See Image Module - Pillow (PIL Fork) 9.2.0 documentation[^].
14 Oct 2022 by MR-XAN777
I have a CSV file that has data about users, and I want to filter them by name endings. This is an example of my CSV file: index, name, id, status 1, John, 500, online 2, Anne, 485, offline 3, Angel, 856,...
27 Oct 2022 by MKGoal25
I have a dataframe including 520 individual identification #s and their frequency count over 10 bin/indices. I want to fit nls() for each of them & save their coef and r^2 values each in a new column. structure(list(User.ID = c("37593", "38643",...
1 Nov 2022 by jplavorr
The sample of the dataset I am working on: # List of Tuples matrix = [(1, 0.3, 0, 0.7, 30, 0, 50), (2, 0.4, 0.4, 0.3, 20, 50, 30), (3, 0.5, 0.2, 0.3, 30, 20, 30), (4, 0.7, 0, 0.3, 100, 0, 40), (5, 0.2,...
7 Jan 2023 by msj here
I have this Dataset: Accession Number Value 0 0001754960-22-000233 258 1 0001754960-22-000233 254 2 0001754960-22-000233 254 3 0001754960-22-000233 259 4 0001754960-22-000233 3753 ... ... ... 2568329 0000950123-22-007614 13660...
12 Apr 2023 by OriginalGriff
While we are more than willing to help those that are stuck, that doesn't mean that we are here to do it all for you! We can't do all the work, you are either getting paid for this, or it's part of your grades and it wouldn't be at all fair for...
16 May 2023 by Apoorva 2022
I have a dataset consisting of 'N' number of features/variables. One of these columns has missing values which I would like to interpolate. The interpolate() method of Pandas helps in doing so. The input for this method is the name of the column...
16 May 2023 by Apoorva666
I want to experiment with different interpolation techniques. Some determining factors for choosing the appropriate interpolation techniques are: 1. Density of data (Checking if the data is dense or sparse) 2. Dimensionality of data (High...
19 Jun 2023 by Glenn Viroux
How to use multi indexing in pandas, with practical use cases such as monitoring changes in the earth surface temperature.
21 Jun 2023 by Member 16034634
I am making a blog website and want to implement a search algorithm which would be powered by ML. I took a Codecademy course on ML but don't know how to combine MERN with ML. I also want to add a similar posts section and also an algorithm to make...
8 Jul 2024 by mcbain19
I wrote the below query because I wanted to return the list of every user that joined when the interval was 9 days using curdate() and joineddate. For example if today was 6/30 username joineddate user1 | 6/21 user2 | 6/20 ...
28 Dec 2021 by Patrice T
Quote: How to convert Roman numerals to Integer in C# without any built-in function and step by step 1) study how roman numerals are built Roman numerals - Wikipedia[^] 2) Get many samples and solve them by hand, you are following rules, they...
8 Mar 2022 by danny sanny
from collections import Counter from matplotlib import pyplot as plt import math # -*- coding: utf-8 -*- print("""Problem 1: Replace the word EXAMPLE three times in the next line with three examples of problems that can be solved by data...
19 May 2021 by Ayush1999
I have to prepare an algorithm that finds relations between data across multiple databases and the degree of relationship between databases, and finds transformations across databases when data moves from one database to another. It is a data lineage...
18 Jan 2022 by Rosaila
This was the question given: Write a menu-based C program to perform operations for a binary search tree (BST). a. Search an element b. Find minimum c. Find maximum d. Insertion e. Deletion In this code I'm unable to perform search and delete...
12 Apr 2023 by Raven Fearlina
https://i.stack.imgur.com/uYYk...
10 May 2021 by Amr Mahmoud 2021
-1 I need to know how I can calculate how much energy in kilowatt hours (kWh) Singapore has saved per year by recycling, using this data. waste_type: The type of waste recycled. waste_disposed_of_tonne: The amount of waste that could not be...
20 May 2021 by Member 14844003
I have to write an NLP algorithm for grouping words from a dictionary into different clusters, e.g. technology, food, color etc., and then be able to classify new words into these clusters. What could be possible code for this simple algorithm? I am new...
6 May 2022 by Member 15627552
#include #include #define QUEUE_SIZE 5 struct queue { int items[QUEUE_SIZE]; int front; int rear; }; typedef struct queue QUEUE; int insert_rear(int item, QUEUE *q) { if (q->rear == QUEUE_SIZE - 1) { ...
9 Dec 2023 by Apoorva666
Definition of interpolation - Interpolation predicts values at a point by studying its neighbouring points (within the same column), as opposed to data modeling where all the columns are taken into consideration when studying the relationship...
21 Aug 2021 by Chandan Nagaraj 2021
I have data in a dataframe in the following format: Row_1 AB123, 01-mar-2011, 30-mar-2011, data1, data2 Row_2 CD123, 01-mar-2011, 30-mar-2011, data1, data2 Row_3 CD123, 01-apr-2011, 30-apr-2011, data1, data2 Row_4 EF123,...
29 Dec 2021 by ahmedbelal
How to convert Roman numerals to Integer in C# without any built-in function and step by step What I have tried: How to convert Roman numerals to Integer in C# without any built-in function and step by step
28 Dec 2021 by Gerry Schmitz
Add the following to your dictionary, then proceed from the left, matching the first 2 chars or, if that is not a match, the first 1 char. Remove the 2 or 1 leading matched chars and repeat the lookup. Add the dictionary's values to get the final result. CM...
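The lookup procedure described here can be sketched as follows. The question asks for C#, but this sketch is in Python for brevity and ports directly. The two-character entries used are the six standard subtractive pairs (CM, CD, XC, XL, IX, IV), which is an assumption since the answer's own list is cut off after "CM"; the names ROMAN_VALUES and roman_to_int are made up for the example.

```python
# Standard single symbols plus the six standard subtractive pairs.
# Assumption: the truncated answer lists the same pairs.
ROMAN_VALUES = {
    "CM": 900, "CD": 400, "XC": 90, "XL": 40, "IX": 9, "IV": 4,
    "M": 1000, "D": 500, "C": 100, "L": 50, "X": 10, "V": 5, "I": 1,
}


def roman_to_int(numeral):
    """Sum dictionary values, consuming 2 chars when they match, else 1."""
    total = 0
    remaining = numeral.upper()
    while remaining:
        two = remaining[:2]
        one = remaining[:1]
        if len(two) == 2 and two in ROMAN_VALUES:
            total += ROMAN_VALUES[two]
            remaining = remaining[2:]  # remove the 2 matched chars
        elif one in ROMAN_VALUES:
            total += ROMAN_VALUES[one]
            remaining = remaining[1:]  # remove the 1 matched char
        else:
            raise ValueError("not a Roman numeral symbol: " + one)
    return total


print(roman_to_int("MCMXCIV"))
```

This sketch does not validate that the input is a well-formed numeral; it only applies the left-to-right lookup.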
15 Feb 2022 by Vlad Calin
Hello, fellows! I hope you are doing fine these days. Currently I am working on a project at work, whose final goal is to create an interface between 2 pieces of software (I cannot mention which software, as it is a research project). The steps...
19 Mar 2022 by clearwaylaw
What is the best scraping code out there? I have a team in Pakistan that has been scraping data for our lawyer directory. I'm looking for some kind of software? What I have tried: We have a third party that does some work for us, but it's not...
24 Jun 2022 by Бојан Б
Hello, I've got trouble with percentages in SQL. To be more precise, it is required to take a certain percentage off of a row in the column. For example (taking 40% of the row's amount). Any help will be appreciated. Thanks. What I have tried: ...
10 Aug 2022 by Gaeth Omari
df_4=pd.DataFrame() for i in range(100): if df_33[i]!=df_new[i]: print("not matching") else: dff=pd.concat([df_33,df_new]) print(dff) KeyError: 0 What I have tried: I was getting this error: KeyError: 0
19 Aug 2022 by Member 15667238
from pickletools import uint8 import numpy as np from PIL import Image image=Image.open('flower.jpg') arr=np.array(image) #arr[:,:]=[,,0] print(arr.shape) print(arr.dtype) print(arr.size) print(arr) img=Image.fromarray(arr) ...
13 Oct 2022 by MR-XAN777
I have a .csv file that has data like this: index, name, id 1 john 512 2 Anne 895 3 Angel 897 4 Lusia 777 So I want to filter them by name endings and get names only which have vowel endings. And the result...
14 Oct 2022 by Fury_5698
I have many csv files with dataframes looking like these. Dummy Data: df1
14 Oct 2022 by Richard MacCutchan
filtereddf.to_csv('output_filt...
8 Nov 2023 by Apoorva666
'Interpolation' is a technique that is used to predict a null/empty value by studying its neighbouring points. Interpolation doesn't take into consideration the entire dataset when predicting the missing value of a column. It only considers the...