
Some Bash magic, graph, csv

Unix shells are powerful. Scripts are useful.

I am working on a project for which I want to know whether the master data is already present in my graph before I process CSV files.

So, the idea is to get all the distinct values from a given column in all the CSV files and count the matching nodes in the graph.

From this list, I then create an array that I use in a Cypher query like

MATCH (n:Label) 
WHERE n.property IN [The Array I created with the values]
RETURN count(n)

Here is the script, in Bash. The trick is to run an awk command for each file. It takes an integer as parameter: 1 for the first column and so on (not 0, because 0 is the whole line for awk).

#!/bin/bash
# Get the unique values in a given column of all matching CSV files.
# $1 is the first parameter of this script and should be an integer:
# the column number for awk (1 = first column, because $0 is the whole line).
# One-off equivalent for column 3: awk -F';' '{print $3}' "$file" >> sumFile.txt
col="$1"
filename="/tmp/getUniqueValues_sh.$(date +%F)"
touch "$filename"
for file in *_*.csv
do
   # pass the column number to awk as a variable instead of building a string for eval
   /usr/bin/awk -F';' -v col="$col" '{print $col}' "$file" >> "$filename"
done
# display unique values
sort -u "$filename" | more
rm "$filename"

The script writes in a temp file in /tmp that is removed after use.

Find the file at https://drive.google.com/file/d/1Jo5tY7IaEek4NFRTZWea4TvSw8KdKC45/view?usp=sharing

The column separator is ; and you can change it easily.
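As a sketch of what changing the separator could look like, here is a small variant that takes the separator as a parameter. The function name unique_column_values and the sample files are made up for illustration, and the header line is treated like any other value.

```shell
# Hypothetical helper: print the sorted unique values of column $1,
# using separator $2, from the files given as the remaining arguments.
unique_column_values() {
  col="$1"
  sep="$2"
  shift 2
  awk -F"$sep" -v col="$col" '{ print $col }' "$@" | sort -u
}

# Demo on two throwaway comma-separated files in a temporary directory
tmpdir=$(mktemp -d)
printf 'id,lang\n1,eng\n2,fra\n' > "$tmpdir/a_1.csv"
printf 'id,lang\n3,eng\n4,swa\n' > "$tmpdir/a_2.csv"
unique_column_values 2 , "$tmpdir"/*_*.csv
# prints eng, fra, lang and swa, one per line
rm -r "$tmpdir"
```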

Execute it like this “getUniqueValues.sh 2” for the second column
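To go from a list of values to the array used in the Cypher query, something like the sketch below could work. It assumes one value per line on standard input and values that contain no single quotes or backslashes. The name to_cypher_list is made up.

```shell
# Hypothetical helper: read values (one per line) on stdin and print
# a Cypher list literal of strings, e.g. ['a', 'b'].
# Assumption: values contain no single quotes or backslashes.
to_cypher_list() {
  sort -u | awk -v q="'" '
    BEGIN { printf "[" }
    { printf "%s%s%s%s", (NR > 1 ? ", " : ""), q, $0, q }
    END { print "]" }'
}

printf 'b\na\nb\n' | to_cypher_list
# prints ['a', 'b']
```

Every value is emitted as a string, so this only matches properties stored as strings. For integer properties the quotes would have to go.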


Neo4j Favorites

You certainly already know that you can add a Cypher statement as a favorite by hitting the star button next to the input box.

A very useful trick is to start it with a comment, so that the comment appears as the title of the favorite in the favorites tab.

// Isolated nodes - count by label
MATCH (n)
WHERE NOT ((n)--())
RETURN DISTINCT (labels(n)), count(n)

It will be listed as “Isolated nodes - count by label”.

Useful, isn't it?


Linux Mint 20 / dual monitors / redshift

I was previously a user of f.lux. On my new install of Mint 20, Redshift is already installed. I was annoyed that it only managed one monitor after I plugged in a second. Some searching took me to http://jonls.dk/redshift/ and I tried my luck with creating the configuration file. I made two updates:
  • commented out screen=1 at the end
  • updated the lat/long values for my place
Restart Redshift and all is fine: both monitors are dimmed in sync.
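For reference, those two changes might look like the sketch below, written as a small shell function that generates the file. The section and key names follow the sample configuration on the Redshift site. The coordinates are placeholders, the helper name write_redshift_conf is made up, and the target location (typically ~/.config/redshift.conf) is an assumption to check on your system.

```shell
# Hypothetical helper: write a minimal Redshift config to the path in $1,
# with latitude $2 and longitude $3.
write_redshift_conf() {
  cat > "$1" <<EOF
[redshift]
location-provider=manual
adjustment-method=randr

[manual]
lat=$2
lon=$3

[randr]
; commented out, so the adjustment is not tied to a single screen
;screen=1
EOF
}

# Write to a scratch file first, review it, then copy it into place
write_redshift_conf /tmp/redshift.conf.example 48.1 11.6
```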

APOC to the rescue #2

In this short post, another APOC feature worth its weight in gold.

The function apoc.any.property takes two parameters: the first is the object (node, relationship or map), the second is the name of a property of that object.

It returns the value of any property of an object.   <– genius 🙂

That is super useful when you want to use several columns from a CSV that have similar names, like pos (swa), pos (fra), pos (eng).

Here is an example to return wanted columns from the line object

WITH ['eng', 'swa', 'fra'] AS lngs
LOAD CSV WITH HEADERS FROM "file:///my_12_langs.csv.file" AS line FIELDTERMINATOR ';'
UNWIND lngs AS lng
WITH 'pos ('+ lng + ')' AS colName, line
RETURN apoc.any.property(line, colName) LIMIT 10

 

 

Here is something a little closer to the real Cypher I use. I create temp nodes because I then check for the presence of properties in conditions.

WITH ['eng', 'swa', 'fra'] AS lngs   // just 3 here, the real file has 12
LOAD CSV WITH HEADERS FROM "file:///my_12_langs.csv.file" AS line FIELDTERMINATOR ';'
CREATE (t:TEMP)
SET t += line
WITH t,lngs
UNWIND lngs AS lng
WITH 'pos ('+ lng + ')' AS colName,t
RETURN colName, apoc.any.property(t, colName) LIMIT 50

MATCH (t:TEMP)
DELETE t   // clean up, run as a separate statement

 

Hope this is useful for you.

Big up to the APOC maintainers.


APOC to the rescue #1

Picture yourself, querying on a text property, looking for duplicates and BOOM you see they not only have the same text but also the same Identifier !!!

Your brain screams “OMG I have nodes with the same identifier ! How can that be !”

You start to wonder if all the confidence you put in Neo is misplaced.
A true trial of faith.

Doubt not.
This can happen, even though you created a fine constraint on the property, even though Neo is written by fine brains and IDs are MVP level for a DB.
You don't believe it?

The reason it can happen is data type.

123456 and ‘123456’ are not the same.
So if you (like me) forgot to convert from string to integer in your LOAD CSV script, that's what happens.

To solve this and identify the nodes you just created, APOC can help.
What can APOC not do ?

There is a function apoc.meta.cypher.type(property) that returns STRING when the property is stored as a string.

The possible values are INTEGER, FLOAT, STRING, BOOLEAN, RELATIONSHIP, NODE, PATH, NULL, MAP, LIST OF <TYPE>, POINT, DATE, DATE_TIME, LOCAL_TIME, LOCAL_DATE_TIME, TIME, DURATION

MATCH (n:Label)
WHERE apoc.meta.cypher.type(n.property) = 'STRING'
RETURN count(n)

That code will give you an idea of the extent of the mistake.

The Cypher code to solve this is coming next 🙂


Ads

You may know me as a Packt author. This is true.
I have a title published at Packt: Learning Neo4j 3.x, Second Edition.

I am also a long time collaborator of Manning, as a technical reviewer on 5 to 10 books (or more).
Just like everybody, I’d love a big passive income.

So I agreed to become an affiliate, thus the banner in the banner.

Currently, I choose to advertise William Lyon's book on GraphQL.


A story plot

That should be for an action movie with lots of explosions.
I wrote it here https://twitter.com/wadael/status/1167030702229086208 in a thread.
Here it is.

Story plot: a big pharma company wages war against big Brazilian farmers to save the forest people. It's a private army hired for PR. It's a beautiful international effort: Afghan vets from all sides, former Yugoslavia soldiers awaited in The Hague, former special forces of the ”good” nations, a few members of ”Snipers for Good”. All united under the pharma banner in the interest of humanity. Looks good.

In the background, it's mob manners: kidnapping of the president's old mama to put a stop to the fires. But she has a heart condition and dies 😦 It pours oil on the fires, in both meanings. Threats, attacks, defenses, a lot of action. People rebel: peaceful sit-in demonstrations held every day, in the dust and fumes coming from the burning areas, are violently attacked by riot police. The riot police are depicted as bloodthirsty, trigger-happy pigs and rapists. All ends as well as possible after the president commits suicide. But the shadowy politicians go back to the shadows, awaiting their time (or the sequel).

Ultimately, the lead action hero learns the real why of all this costly save-the-trees operation: it was to save the natives living in the forest. Satisfaction for hero and audience. Then... we learn of a new miracle diet pill marketed by big pharma. Wait for it... It uses gut bacteria from the natives, thanks to collected feces. As accompanying files show, natives were kidnapped for trials, which concluded that only forest food and conditions would secure production. Thus the military intervention. The end.


My 3D designs

I share them on Thingiverse

https://www.thingiverse.com/jeronimobenbattoun/designs

 

So far: propellers to create soap bubbles, a mold for (big) sushi, a small-to-big bit adapter for a screwdriver.


 

 

 


First STL design

Been playing with Fusion360.

Made a link that should help with your USB keys lying around.

Print several, assemble them. Nail the first on the wall

Download here

Warning: not tested, as I ain't got no printer yet.


[SOLVED] Cheap CH340g-based Arduino Nano clones not detected/can't upload

I have spent a few hours trying things and looking for answers about why two brand new Nano clones were not detected on two computers with three OSes (2 Mints with != kernels, and windows 7).

First thing first, I guessed it could be a cable problem.
Well, the boards were on, and when I connected my old G1 I had a notification.
Cable was presumed OK.

I looked for fancy drivers. Found a 341SER driver for Linux, with kernel up to 3.13.
Uh oh. Problem. However, I had worked with a third Nano on those PCs. So unless there was a regression in the kernel, it should “logically” work a year later.

I read many pages, found the Windows driver and installed it. No success.

I’ve read that using a USB3 port may damage the Nanos. To be verified.

I have updated the kernel on a pc to 5.1, to test on 4.x and 5.x kernels.

And of course, I have tried a lot of combinations in the IDE for board/chip/programmer

I’ll stop here the list of everything I tried.

IN THE END, IT WAS THE CABLE!
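On Linux, a quick way to tell a bad cable from a driver problem is to check whether the board shows up on the USB bus at all. A minimal sketch, assuming the clone reports the USB ID 1a86:7523 that CH340 adapters commonly use. The function name has_ch340 is made up for illustration.

```shell
# Hypothetical helper: succeeds if the lsusb-style text on stdin
# lists a device with the USB ID 1a86:7523 (assumed CH340 ID).
has_ch340() {
  grep -qi 'ID 1a86:7523'
}

# Only query the bus if lsusb is available on this machine
if command -v lsusb >/dev/null 2>&1; then
  if lsusb | has_ch340; then
    echo "CH340 adapter detected"
  else
    echo "No CH340 adapter on the USB bus: check the cable first"
  fi
fi
```

If the board is powered but nothing is listed, swapping the cable is a cheap first test before touching drivers or kernels.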

Using the Nano model in the IDE does not work. I am getting:

avrdude: stk500_recv(): programmer is not responding
avrdude: stk500_getsync() attempt 1 of 10: not in sync: resp=0x00

 

I use: Arduino Duemilanove or Diecimila, ATmega328P, AVR ISP

TODAY, cables box 1 (I have 2), where I store “cables that can be useful someday”, was INDEED useful.

Do not underestimate the mighty issue-solving powers of the cables box (and double checking).