MongoDB

I recently had a request to remove part of a text field from MySQL, JSON data but stored as text in MySQL 5.6.

Has this been Mongo, this could have been a pretty quick update of the data within an array:

Find the data

db.accounts.find({ “vendor.account_number”: { $exists: true }}, {“vendor”:1,”timestamp”:1}).forEach(printjson)
Update the data

db.accounts.update(
{ “vendor.account_number”: { $exists: true }},
{ $set:{“vendor.name”: “XXXXXX”,”vendor.account_number”: “XXXXXX”}}, {multi:true}
)

On MySQL, this was a bit trickier. I was able to pull out the segment to be updated using SUBSTRING. Once I was able to identify the SUBSTRING values, I used REPLACE with SUBSTRING_INDEX for the substitution “from” values.

SUBSTRING_INDEX Syntax: SUBSTRING_INDEX(str,delim,count)
- My data had variable length fields, but were consistent on the prefix for the data, and the character immediately at the end of the section:
- SUBSTRING_INDEX(request, ‘”vendor”:{“name”:”‘, -1)
  - str = request (my column)
  - delim = “vendor”:{“name”:”‘ — name was potentially used in more locations, so the longer string
  - count = -1
    - “If count is positive, everything to the left of the final delimiter (counting from the left) is returned. If count is negative, everything to the right of the final delimiter (counting from the right) is returned”
REPLACE Syntax: REPLACE(str,from_str,to_str)
- Since my data varied on each record, I had to create a new “from string” to match. I set the update on the same column twice as I was running multiple modifications.
  - REPLACE(request, SUBSTRING_INDEX(SUBSTRING_INDEX(request, ‘”vendor”:{“name”:”‘, -1),'”,’,1), ‘XXXXXXXX’)
    - str = request — my column
    - from_str = SUBSTRING_INDEX(SUBSTRING_INDEX(request, ‘”vendor”:{“name”:”‘, -1),'”,’,1) — my string subsection, variable by row
    - to_str = ‘XXXXXXXX’ — my replacement/”to” value

Find the data

SELECT id
, SUBSTRING_INDEX(SUBSTRING_INDEX(request, ‘”vendor”:{“name”:”‘, -1),'”,’,1) AS vendor_name_part
, SUBSTRING_INDEX(SUBSTRING_INDEX(request, ‘”account_number”:”‘, -1),'”}’,1) AS account_number_part
, request
FROM call_logs
WHERE request LIKE ‘%vendor%account_number%’;
Update the data

UPDATE call_logs
SET request = REPLACE(request, SUBSTRING_INDEX(SUBSTRING_INDEX(request, ‘”vendor”:{“name”:”‘, -1),'”,’,1), ‘XXXXXXXX’),
request = REPLACE(request, SUBSTRING_INDEX(SUBSTRING_INDEX(request, ‘”account_number”:”‘, -1),'”}’,1), ‘XXXXXXXX’)
WHERE request LIKE ‘%vendor%account_number%’;

Before the update:

Screen Shot 2017-06-15 at 4.50.45 PM

After the update:

Screen Shot 2017-06-15 at 4.51.36 PM

As a SQL DBA, I have some common queries I use to get by certain information, organized by date. I’m starting to do a bit more work with Mongo, and it can be a bit frustrating trying to translate what I can do with SQL into Mongo.

Here is one I figured out recently, I thought it may be useful for others. I want to get a count of items for a certain period of time, grouped & ordered by date.

Take this example in MySQL:

I’m formatting the CreationDate in the select, which will be used by the Grouping & Ordering. This parses out the additional date fields that would affect the group by (e.g. 7/1/15 10:51 and 7/1/15 11:51 can now be grouped)
Count: This is the data to be grouped by date
Where clause: Just get me the date since 7/1/15
Group by: Use the OrderDate for grouping
Order By: Use the OrderDate for ordering, use Ascending order (can use DESC for reverse order)

SELECT DATE_FORMAT(creationdate, '%m-%d-%y') AS 'OrderDate', COUNT(DISTINCT O.ORDER_ID) AS OrderCount, COUNT(OL.ORDERLINE_ID) AS 'OrderLines' FROM ORDERLINE OL JOIN ORDER O ON OL.ORDERID = O.ORDER_ID WHERE O.creationdate > '2015-07-01' AND O.DELETED = 0 AND OL.DELETED = 0 GROUP BY OrderDate ORDER BY OrderDate ASC

Sample Data:

OrderDate	OrderCount	OrderLines
7/1/15	163	12553
7/2/15	160	16981
7/3/15	18	1194
7/5/15	7	2488
7/6/15	176	13328
7/7/15	148	13586
7/8/15	186	12795
7/9/15	143	17210
7/10/15	84	7088

On Mongo, I can do something similar – In this instance I’m counting the documents, but there are additional functions to sum/avg/count etc date within documents.

Match: In the aggregate, it’s like the Where clause: just get me the documents since 7/1/15
Group: This is where I’m defining the grouping criteria. There is probably a prettier way to do this, but I’m breaking out the Year, Month & Day for each timestamp, similar to where it’s seen in the SQL Select portion.
Count: the data I’m grouping by date
Sort: Order by the _id group, 1 is ascending order, -1 is descending order.

db.Orders.aggregate( [ { $match : { timestamp :{ $gt: ISODate("2015-07-01T00:00:00.000Z")}} }, { $group : { _id : { year: { $year: "$timestamp" } , month: { $month: "$timestamp" }, day: { $dayOfMonth: "$timestamp" }}, count: { $sum: 1 } } }, { $sort: { _id: 1 } } ] )

Sample data:

{ "_id" : { "year" : 2015, "month" : 7, "day" : 1 }, "count" : 956 } { "_id" : { "year" : 2015, "month" : 7, "day" : 2 }, "count" : 107 } { "_id" : { "year" : 2015, "month" : 7, "day" : 3 }, "count" : 73 } { "_id" : { "year" : 2015, "month" : 7, "day" : 4 }, "count" : 73 } { "_id" : { "year" : 2015, "month" : 7, "day" : 5 }, "count" : 73 } { "_id" : { "year" : 2015, "month" : 7, "day" : 6 }, "count" : 6385 } { "_id" : { "year" : 2015, "month" : 7, "day" : 7 }, "count" : 106 } { "_id" : { "year" : 2015, "month" : 7, "day" : 8 }, "count" : 3 } { "_id" : { "year" : 2015, "month" : 7, "day" : 9 }, "count" : 2 }

	Capturing Blocking \|… on Capturing Foreign Key Err…
	Capturing Blocking \|… on Capturing Deadlocks
	Capturing Foreign Ke… on Capturing Deadlocks
	dineshramitc on Group & Count Objects by D…
	dineshramitc on SQL Server Peer-to-Peer Replic…

The Polyglot DBA

Multiplatform DBA – mainly focusing on MySQL, SQL Server & MongoDB

Find & Replace Partial Strings

Group & Count Objects by Day – Mongo