Personalise your emails with recommenders (part 1)

This is the first post in our Smarter Segments series, in which we explore and implement the stepping stones needed to integrate personalised product recommendations into existing email marketing campaigns. We will use AWS Personalize to create the recommendations and AWS Pinpoint to deliver the emails, but the techniques here should work with any email service provider (ESP).

To build a recommendation engine that you can use in your email marketing campaigns, we need to 1) set up the recommendation engine and 2) query it.

What are we doing?

This is a multi-part series that builds a minimum viable ML pipeline to satisfy a basic use case:

The Use Case

You are an e-commerce store with 100+ daily active users who have made 10K+ interactions in total with your products (purchased, viewed, clicked_for_info).

You’re about to send your monthly newsletter to your users. Only this time you want to add a section to the email that displays a list of products recommended specifically for each user, making your emails more personalised and therefore more likely to convert.

The main components that make up your recommender engine are the Dataset Group, which manages access to your data, and the recommender, which contains the analytical method used to produce different types of recommendation.

The steps needed

  • Data acquisition: we begin by acquiring the data. Specifically, we will need three sets of data.
    • The interaction data: this is the most important set. It contains columns such as the user ID, item ID and interaction type.
    • The items data: this is where the details of the items are stored.
    • The user data: this is where we store the users’ details.
  • Data cleaning and processing: we will add and remove columns as appropriate before handing the data to the recommender.
  • Creating the recommender: this engine will ingest the prepared data and do the ML number crunching.
  • Querying the recommender: once we have the recommender, we need it somewhere we can use it to provide recommendations for our users.
  • Embedding the recommendations: once we obtain the recommendations, we need to embed them in that user’s newsletter email before sending it (this will be covered in part 2).

Data acquisition

We will use the code examples from the AWS Personalize Samples as a starting point and modify them later to customise. The data we will use is sample data provided by AWS.

Our demo data can be accessed using the AWS CLI. Setting up the AWS CLI is beyond the scope of this blog. In Jupyter Notebook, create a file and add the following:

<span class="line"><span style="color: #DC322F">!</span><span style="color: #839496">aws s3 cp s3:</span><span style="color: #859900">//</span><span style="color: #839496">retail</span><span style="color: #859900">-</span><span style="color: #839496">demo</span><span style="color: #859900">-</span><span style="color: #839496">store</span><span style="color: #859900">-</span><span style="color: #839496">us</span><span style="color: #859900">-</span><span style="color: #839496">east</span><span style="color: #859900">-</span><span style="color: #D33682">1</span><span style="color: #859900">/</span><span style="color: #839496">csvs</span><span style="color: #859900">/</span><span style="color: #839496">items.csv .</span></span>
<span class="line"><span style="color: #DC322F">!</span><span style="color: #839496">aws s3 cp s3:</span><span style="color: #859900">//</span><span style="color: #839496">retail</span><span style="color: #859900">-</span><span style="color: #839496">demo</span><span style="color: #859900">-</span><span style="color: #839496">store</span><span style="color: #859900">-</span><span style="color: #839496">us</span><span style="color: #859900">-</span><span style="color: #839496">east</span><span style="color: #859900">-</span><span style="color: #D33682">1</span><span style="color: #859900">/</span><span style="color: #839496">csvs</span><span style="color: #859900">/</span><span style="color: #839496">interactions.csv .</span></span>

Let’s do the imports and inspect the downloaded files.

<span class="line"><span style="color: #859900">import</span><span style="color: #839496"> boto3</span></span>
<span class="line"><span style="color: #859900">import</span><span style="color: #839496"> json</span></span>
<span class="line"><span style="color: #859900">import</span><span style="color: #839496"> numpy </span><span style="color: #859900">as</span><span style="color: #839496"> np</span></span>
<span class="line"><span style="color: #859900">import</span><span style="color: #839496"> pandas </span><span style="color: #859900">as</span><span style="color: #839496"> pd</span></span>
<span class="line"><span style="color: #859900">import</span><span style="color: #839496"> time</span></span>
<span class="line"><span style="color: #859900">import</span><span style="color: #839496"> datetime</span></span>
<span class="line"></span>
<span class="line"><span style="color: #839496">df </span><span style="color: #859900">=</span><span style="color: #839496"> pd.read_csv(</span><span style="color: #2AA198">'./interactions.csv'</span><span style="color: #839496">)</span></span>

What events have we defined as interactions?

<span class="line"><span style="color: #839496">df.</span><span style="color: #CB4B16">EVENT_TYPE</span><span style="color: #839496">.value_counts()</span></span>

The only housekeeping we’ll do on the interactions data is to drop the DISCOUNT column, since we’re not going to use it.

<span class="line"><span style="color: #839496">test</span><span style="color: #859900">=</span><span style="color: #839496">df.drop(columns</span><span style="color: #859900">=</span><span style="color: #839496">[</span><span style="color: #2AA198">'DISCOUNT'</span><span style="color: #839496">])</span></span>
<span class="line"><span style="color: #839496">df</span><span style="color: #859900">=</span><span style="color: #839496">test</span></span>
<span class="line"><span style="color: #839496">df.sample(</span><span style="color: #D33682">10</span><span style="color: #839496">)</span></span>

That’s all the cleaning we are going to do. We’ve only dropped one column, and we’re now ready to save the cleaned file.

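<span class="line"><span style="color: #839496">clean_training_data_file_name </span><span style="color: #859900">=</span><span style="color: #839496"> </span><span style="color: #2AA198">"cleaned_training_data.csv"</span></span>
<span class="line"><span style="color: #839496">df.to_csv(clean_training_data_file_name, index</span><span style="color: #859900">=</span><span style="color: #839496">False)  </span><span style="color: #586E75; font-style: italic"># index=False keeps the pandas row index out of the file</span></span>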
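Before uploading, it can be worth confirming that the header of the saved file contains the columns the interactions schema in this post expects. The sketch below uses only the standard library; `missing_columns` and `REQUIRED_COLUMNS` are illustrative names introduced here, not part of pandas or AWS Personalize.

```python
import csv

# Column names taken from the interactions schema used in this post.
REQUIRED_COLUMNS = ("USER_ID", "ITEM_ID", "TIMESTAMP", "EVENT_TYPE")


def missing_columns(csv_path, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from the CSV header."""
    with open(csv_path, newline="") as handle:
        # Read only the first row; an empty file yields an empty header.
        header = next(csv.reader(handle), [])
    present = {name.strip() for name in header}
    return [name for name in required if name not in present]


# Usage (assumes the cleaned file has been saved in the working directory):
# missing = missing_columns("cleaned_training_data.csv")
# if missing:
#     raise ValueError(f"cleaned file is missing columns: {missing}")
```

An empty list means every required column is present; extra columns in the file are not checked by this helper.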

That’s the only file we are going to upload to S3 for the recommender to train a model on. We will keep items.csv locally and use it to look up product details by ID (the recommender returns the recommended products by ID).

Let’s now create the S3 bucket to upload our cleaned data to, using our boto3 client.

Python
<span class="line"><span style="color: #586E75; font-style: italic"># Configure the SDK to Personalize:</span></span>
<span class="line"><span style="color: #839496">personalize </span><span style="color: #859900">=</span><span style="color: #839496"> boto3.client(</span><span style="color: #2AA198">'personalize'</span><span style="color: #839496">)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #839496">region </span><span style="color: #859900">=</span><span style="color: #839496"> </span><span style="color: #2AA198">"eu-west-1"</span></span>
<span class="line"><span style="color: #839496">s3 </span><span style="color: #859900">=</span><span style="color: #839496"> boto3.client(</span><span style="color: #2AA198">'s3'</span><span style="color: #839496">)</span></span>
<span class="line"><span style="color: #839496">account_id </span><span style="color: #859900">=</span><span style="color: #839496"> boto3.client(</span><span style="color: #2AA198">'sts'</span><span style="color: #839496">).get_caller_identity().get(</span><span style="color: #2AA198">'Account'</span><span style="color: #839496">)</span></span>
<span class="line"><span style="color: #839496">bucket_name </span><span style="color: #859900">=</span><span style="color: #839496"> account_id </span><span style="color: #859900">+</span><span style="color: #839496"> </span><span style="color: #2AA198">"-"</span><span style="color: #839496"> </span><span style="color: #859900">+</span><span style="color: #839496"> region </span><span style="color: #859900">+</span><span style="color: #839496"> </span><span style="color: #2AA198">"-"</span><span style="color: #839496"> </span><span style="color: #859900">+</span><span style="color: #839496"> </span><span style="color: #2AA198">"user-recommendations"</span></span>
<span class="line"><span style="color: #268BD2">print</span><span style="color: #839496">(</span><span style="color: #2AA198">'bucket_name:'</span><span style="color: #839496">, bucket_name)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #859900">try</span><span style="color: #839496">: </span></span>
<span class="line"><span style="color: #839496">    s3.create_bucket(</span></span>
<span class="line"><span style="color: #839496">        Bucket </span><span style="color: #859900">=</span><span style="color: #839496"> bucket_name,</span></span>
<span class="line"><span style="color: #839496">        CreateBucketConfiguration</span><span style="color: #859900">=</span><span style="color: #839496">{</span><span style="color: #2AA198">'LocationConstraint'</span><span style="color: #839496">: region})</span></span>
<span class="line"></span>
<span class="line"><span style="color: #859900">except</span><span style="color: #839496"> s3.exceptions.BucketAlreadyOwnedByYou:</span></span>
<span class="line"><span style="color: #839496">    </span><span style="color: #268BD2">print</span><span style="color: #839496">(</span><span style="color: #2AA198">"Bucket already exists. Using bucket"</span><span style="color: #839496">, bucket_name)</span></span>
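One caveat if you change the region: S3 rejects an explicit `LocationConstraint` of `us-east-1`, so the `CreateBucketConfiguration` argument has to be omitted there. A minimal sketch of a helper that builds the right arguments; `create_bucket_kwargs` is a hypothetical name introduced for illustration.

```python
def create_bucket_kwargs(bucket_name, region):
    """Build keyword arguments for s3.create_bucket for the given region."""
    kwargs = {"Bucket": bucket_name}
    # us-east-1 is the default location and does not accept an explicit
    # LocationConstraint, so only add one for other regions.
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs


# Usage with the client and variables defined above:
# s3.create_bucket(**create_bucket_kwargs(bucket_name, region))
```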

We then assign a policy to the bucket to allow AWS Personalize to access it.

Python
<span class="line"><span style="color: #839496">s3 </span><span style="color: #859900">=</span><span style="color: #839496"> boto3.client(</span><span style="color: #2AA198">"s3"</span><span style="color: #839496">)</span></span>
<span class="line"><span style="color: #839496">policy </span><span style="color: #859900">=</span><span style="color: #839496"> {</span></span>
<span class="line"><span style="color: #839496">    </span><span style="color: #2AA198">"Version"</span><span style="color: #839496">: </span><span style="color: #2AA198">"2012-10-17"</span><span style="color: #839496">,</span></span>
<span class="line"><span style="color: #839496">    </span><span style="color: #2AA198">"Id"</span><span style="color: #839496">: </span><span style="color: #2AA198">"PersonalizeS3BucketAccessPolicy"</span><span style="color: #839496">,</span></span>
<span class="line"><span style="color: #839496">    </span><span style="color: #2AA198">"Statement"</span><span style="color: #839496">: [</span></span>
<span class="line"><span style="color: #839496">        {</span></span>
<span class="line"><span style="color: #839496">            </span><span style="color: #2AA198">"Sid"</span><span style="color: #839496">: </span><span style="color: #2AA198">"PersonalizeS3BucketAccessPolicy"</span><span style="color: #839496">,</span></span>
<span class="line"><span style="color: #839496">            </span><span style="color: #2AA198">"Effect"</span><span style="color: #839496">: </span><span style="color: #2AA198">"Allow"</span><span style="color: #839496">,</span></span>
<span class="line"><span style="color: #839496">            </span><span style="color: #2AA198">"Principal"</span><span style="color: #839496">: {</span></span>
<span class="line"><span style="color: #839496">                </span><span style="color: #2AA198">"Service"</span><span style="color: #839496">: </span><span style="color: #2AA198">"personalize.amazonaws.com"</span></span>
<span class="line"><span style="color: #839496">            },</span></span>
<span class="line"><span style="color: #839496">            </span><span style="color: #2AA198">"Action"</span><span style="color: #839496">: [</span></span>
<span class="line"><span style="color: #839496">                </span><span style="color: #2AA198">"s3:GetObject"</span><span style="color: #839496">,</span></span>
<span class="line"><span style="color: #839496">                </span><span style="color: #2AA198">"s3:ListBucket"</span></span>
<span class="line"><span style="color: #839496">            ],</span></span>
<span class="line"><span style="color: #839496">            </span><span style="color: #2AA198">"Resource"</span><span style="color: #839496">: [</span></span>
<span class="line"><span style="color: #839496">                </span><span style="color: #2AA198">"arn:aws:s3:::</span><span style="color: #CB4B16">{}</span><span style="color: #2AA198">"</span><span style="color: #839496">.format(bucket_name),</span></span>
<span class="line"><span style="color: #839496">                </span><span style="color: #2AA198">"arn:aws:s3:::</span><span style="color: #CB4B16">{}</span><span style="color: #2AA198">/*"</span><span style="color: #839496">.format(bucket_name)</span></span>
<span class="line"><span style="color: #839496">            ]</span></span>
<span class="line"><span style="color: #839496">        }</span></span>
<span class="line"><span style="color: #839496">    ]</span></span>
<span class="line"><span style="color: #839496">}</span></span>
<span class="line"></span>
<span class="line"><span style="color: #839496">s3.put_bucket_policy(Bucket</span><span style="color: #859900">=</span><span style="color: #839496">bucket_name, Policy</span><span style="color: #859900">=</span><span style="color: #839496">json.dumps(policy))</span></span>

Once the bucket is created, we can upload the cleaned_training_data.csv file we created earlier to it. AWS Personalize will read this file to do the number crunching.

Python
<span class="line"><span style="color: #839496">boto3.Session().resource(</span><span style="color: #2AA198">'s3'</span><span style="color: #839496">).Bucket(bucket_name).Object(clean_training_data_file_name).upload_file(clean_training_data_file_name)</span></span>
<span class="line"><span style="color: #839496">interactions_s3DataPath </span><span style="color: #859900">=</span><span style="color: #839496"> </span><span style="color: #2AA198">"s3://"</span><span style="color: #859900">+</span><span style="color: #839496">bucket_name</span><span style="color: #859900">+</span><span style="color: #2AA198">"/"</span><span style="color: #859900">+</span><span style="color: #839496">clean_training_data_file_name</span></span>

After the data cleaning and housekeeping, we’re ready to assemble the AWS components needed to achieve basic recommendations. The main components are the dataset group and the recommender.

core AWS Personalize components

And they can be put together like this…

Creating the dataset group: we simply state the domain to which the dataset group belongs. Different domains require different schemas. We use the resulting ARN to query the group’s status until we know that it has been created.

The Dataset Group

Python
<span class="line"><span style="color: #839496">response </span><span style="color: #859900">=</span><span style="color: #839496"> personalize.create_dataset_group(</span></span>
<span class="line"><span style="color: #839496">    name</span><span style="color: #859900">=</span><span style="color: #2AA198">'personalize_ecomemerce_ds_group'</span><span style="color: #839496">,</span></span>
<span class="line"><span style="color: #839496">    domain</span><span style="color: #859900">=</span><span style="color: #2AA198">'ECOMMERCE'</span></span>
<span class="line"><span style="color: #839496">)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #839496">dataset_group_arn </span><span style="color: #859900">=</span><span style="color: #839496"> response[</span><span style="color: #2AA198">'datasetGroupArn'</span><span style="color: #839496">]</span></span>
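The block above does not show the polling step described earlier, so here is a minimal sketch of one way to do it. `wait_for_active` is a hypothetical helper, and the interval and timeout defaults are arbitrary assumptions. It takes a callable that returns the current status string, so the same loop can be reused for the other Personalize resources created in this post.

```python
import time


def wait_for_active(get_status, poll_seconds=15, max_wait_seconds=1800, sleep=time.sleep):
    """Poll get_status() until it returns "ACTIVE".

    Raises RuntimeError if the status becomes "CREATE FAILED", and
    TimeoutError once roughly max_wait_seconds have been spent sleeping.
    """
    waited = 0
    while True:
        status = get_status()
        if status == "ACTIVE":
            return status
        if status == "CREATE FAILED":
            raise RuntimeError("resource creation failed")
        if waited >= max_wait_seconds:
            raise TimeoutError(f"still {status!r} after {waited} seconds")
        sleep(poll_seconds)
        waited += poll_seconds


# Usage with the client and ARN from the block above:
# wait_for_active(
#     lambda: personalize.describe_dataset_group(
#         datasetGroupArn=dataset_group_arn
#     )["datasetGroup"]["status"]
# )
```

Passing `sleep` as a parameter is only there to make the loop easy to test without real waiting.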

The Schema: The only schema relevant here is the one that describes the interactions data set.

Python
<span class="line"><span style="color: #839496">interactions_schema </span><span style="color: #859900">=</span><span style="color: #839496"> schema </span><span style="color: #859900">=</span><span style="color: #839496"> {</span></span>
<span class="line"><span style="color: #839496">    </span><span style="color: #2AA198">"type"</span><span style="color: #839496">: </span><span style="color: #2AA198">"record"</span><span style="color: #839496">,</span></span>
<span class="line"><span style="color: #839496">    </span><span style="color: #2AA198">"name"</span><span style="color: #839496">: </span><span style="color: #2AA198">"Interactions"</span><span style="color: #839496">,</span></span>
<span class="line"><span style="color: #839496">    </span><span style="color: #2AA198">"namespace"</span><span style="color: #839496">: </span><span style="color: #2AA198">"com.amazonaws.personalize.schema"</span><span style="color: #839496">,</span></span>
<span class="line"><span style="color: #839496">    </span><span style="color: #2AA198">"fields"</span><span style="color: #839496">: [</span></span>
<span class="line"><span style="color: #839496">        {</span></span>
<span class="line"><span style="color: #839496">            </span><span style="color: #2AA198">"name"</span><span style="color: #839496">: </span><span style="color: #2AA198">"USER_ID"</span><span style="color: #839496">,</span></span>
<span class="line"><span style="color: #839496">            </span><span style="color: #2AA198">"type"</span><span style="color: #839496">: </span><span style="color: #2AA198">"string"</span></span>
<span class="line"><span style="color: #839496">        },</span></span>
<span class="line"><span style="color: #839496">        {</span></span>
<span class="line"><span style="color: #839496">            </span><span style="color: #2AA198">"name"</span><span style="color: #839496">: </span><span style="color: #2AA198">"ITEM_ID"</span><span style="color: #839496">,</span></span>
<span class="line"><span style="color: #839496">            </span><span style="color: #2AA198">"type"</span><span style="color: #839496">: </span><span style="color: #2AA198">"string"</span></span>
<span class="line"><span style="color: #839496">        },</span></span>
<span class="line"><span style="color: #839496">        {</span></span>
<span class="line"><span style="color: #839496">            </span><span style="color: #2AA198">"name"</span><span style="color: #839496">: </span><span style="color: #2AA198">"TIMESTAMP"</span><span style="color: #839496">,</span></span>
<span class="line"><span style="color: #839496">            </span><span style="color: #2AA198">"type"</span><span style="color: #839496">: </span><span style="color: #2AA198">"long"</span></span>
<span class="line"><span style="color: #839496">        },</span></span>
<span class="line"><span style="color: #839496">        {</span></span>
<span class="line"><span style="color: #839496">            </span><span style="color: #2AA198">"name"</span><span style="color: #839496">: </span><span style="color: #2AA198">"EVENT_TYPE"</span><span style="color: #839496">,</span></span>
<span class="line"><span style="color: #839496">            </span><span style="color: #2AA198">"type"</span><span style="color: #839496">: </span><span style="color: #2AA198">"string"</span></span>
<span class="line"><span style="color: #839496">            </span></span>
<span class="line"><span style="color: #839496">        }</span></span>
<span class="line"><span style="color: #839496">    ],</span></span>
<span class="line"><span style="color: #839496">    </span><span style="color: #2AA198">"version"</span><span style="color: #839496">: </span><span style="color: #2AA198">"1.0"</span></span>
<span class="line"><span style="color: #839496">}</span></span>
<span class="line"></span>
<span class="line"><span style="color: #839496">create_schema_response </span><span style="color: #859900">=</span><span style="color: #839496"> personalize.create_schema(</span></span>
<span class="line"><span style="color: #839496">    name </span><span style="color: #859900">=</span><span style="color: #839496"> </span><span style="color: #2AA198">"personalize-ecommerce-interatn_group"</span><span style="color: #839496">,</span></span>
<span class="line"><span style="color: #839496">    domain </span><span style="color: #859900">=</span><span style="color: #839496"> </span><span style="color: #2AA198">"ECOMMERCE"</span><span style="color: #839496">,</span></span>
<span class="line"><span style="color: #839496">    schema </span><span style="color: #859900">=</span><span style="color: #839496"> json.dumps(interactions_schema)</span></span>
<span class="line"><span style="color: #839496">)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #839496">interaction_schema_arn </span><span style="color: #859900">=</span><span style="color: #839496"> create_schema_response[</span><span style="color: #2AA198">'schemaArn'</span><span style="color: #839496">]</span></span>

Now that we have the Dataset Group and a Schema to describe the data, we can create the Dataset. We specify the type of dataset we want to create; in this case it is “INTERACTIONS”.

Python
<span class="line"><span style="color: #839496">dataset_type </span><span style="color: #859900">=</span><span style="color: #839496"> </span><span style="color: #2AA198">"INTERACTIONS"</span></span>
<span class="line"></span>
<span class="line"><span style="color: #839496">create_dataset_response </span><span style="color: #859900">=</span><span style="color: #839496"> personalize.create_dataset(</span></span>
<span class="line"><span style="color: #839496">    name </span><span style="color: #859900">=</span><span style="color: #839496"> </span><span style="color: #2AA198">"personalize_ecommerce_demo_interactions"</span><span style="color: #839496">,</span></span>
<span class="line"><span style="color: #839496">    datasetType </span><span style="color: #859900">=</span><span style="color: #839496"> dataset_type,</span></span>
<span class="line"><span style="color: #839496">    datasetGroupArn </span><span style="color: #859900">=</span><span style="color: #839496"> dataset_group_arn,</span></span>
<span class="line"><span style="color: #839496">    schemaArn </span><span style="color: #859900">=</span><span style="color: #839496"> interaction_schema_arn</span></span>
<span class="line"><span style="color: #839496">)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #839496">interactions_dataset_arn </span><span style="color: #859900">=</span><span style="color: #839496"> create_dataset_response[</span><span style="color: #2AA198">'datasetArn'</span><span style="color: #839496">]</span></span>

Create the personalize role

Python
<span class="line"><span style="color: #839496">iam </span><span style="color: #859900">=</span><span style="color: #839496"> boto3.client(</span><span style="color: #2AA198">"iam"</span><span style="color: #839496">)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #839496">role_name </span><span style="color: #859900">=</span><span style="color: #839496"> </span><span style="color: #2AA198">"PersonalizeRoleEcommerceDemoRecommender"</span></span>
<span class="line"><span style="color: #839496">assume_role_policy_document </span><span style="color: #859900">=</span><span style="color: #839496"> {</span></span>
<span class="line"><span style="color: #839496">    </span><span style="color: #2AA198">"Version"</span><span style="color: #839496">: </span><span style="color: #2AA198">"2012-10-17"</span><span style="color: #839496">,</span></span>
<span class="line"><span style="color: #839496">    </span><span style="color: #2AA198">"Statement"</span><span style="color: #839496">: [</span></span>
<span class="line"><span style="color: #839496">        {</span></span>
<span class="line"><span style="color: #839496">          </span><span style="color: #2AA198">"Effect"</span><span style="color: #839496">: </span><span style="color: #2AA198">"Allow"</span><span style="color: #839496">,</span></span>
<span class="line"><span style="color: #839496">          </span><span style="color: #2AA198">"Principal"</span><span style="color: #839496">: {</span></span>
<span class="line"><span style="color: #839496">            </span><span style="color: #2AA198">"Service"</span><span style="color: #839496">: </span><span style="color: #2AA198">"personalize.amazonaws.com"</span></span>
<span class="line"><span style="color: #839496">          },</span></span>
<span class="line"><span style="color: #839496">          </span><span style="color: #2AA198">"Action"</span><span style="color: #839496">: </span><span style="color: #2AA198">"sts:AssumeRole"</span></span>
<span class="line"><span style="color: #839496">        }</span></span>
<span class="line"><span style="color: #839496">    ]</span></span>
<span class="line"><span style="color: #839496">}</span></span>
<span class="line"></span>
<span class="line"><span style="color: #839496">create_role_response </span><span style="color: #859900">=</span><span style="color: #839496"> iam.create_role(</span></span>
<span class="line"><span style="color: #839496">    RoleName </span><span style="color: #859900">=</span><span style="color: #839496"> role_name,</span></span>
<span class="line"><span style="color: #839496">    AssumeRolePolicyDocument </span><span style="color: #859900">=</span><span style="color: #839496"> json.dumps(assume_role_policy_document)</span></span>
<span class="line"><span style="color: #839496">)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #586E75; font-style: italic"># AmazonPersonalizeFullAccess provides access to any S3 bucket with a name that includes "personalize" or "Personalize" </span></span>
<span class="line"><span style="color: #586E75; font-style: italic"># if you would like to use a bucket with a different name, please consider creating and attaching a new policy</span></span>
<span class="line"><span style="color: #586E75; font-style: italic"># that provides read access to your bucket or attaching the AmazonS3ReadOnlyAccess policy to the role</span></span>
<span class="line"><span style="color: #839496">policy_arn </span><span style="color: #859900">=</span><span style="color: #839496"> </span><span style="color: #2AA198">"arn:aws:iam::aws:policy/service-role/AmazonPersonalizeFullAccess"</span></span>
<span class="line"><span style="color: #839496">iam.attach_role_policy(</span></span>
<span class="line"><span style="color: #839496">    RoleName </span><span style="color: #859900">=</span><span style="color: #839496"> role_name,</span></span>
<span class="line"><span style="color: #839496">    PolicyArn </span><span style="color: #859900">=</span><span style="color: #839496"> policy_arn</span></span>
<span class="line"><span style="color: #839496">)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #586E75; font-style: italic"># Now add S3 support</span></span>
<span class="line"><span style="color: #839496">iam.attach_role_policy(</span></span>
<span class="line"><span style="color: #839496">    PolicyArn</span><span style="color: #859900">=</span><span style="color: #2AA198">'arn:aws:iam::aws:policy/AmazonS3FullAccess'</span><span style="color: #839496">,</span></span>
<span class="line"><span style="color: #839496">    RoleName</span><span style="color: #859900">=</span><span style="color: #839496">role_name</span></span>
<span class="line"><span style="color: #839496">)</span></span>
<span class="line"><span style="color: #839496">time.sleep(</span><span style="color: #D33682">60</span><span style="color: #839496">) </span><span style="color: #586E75; font-style: italic"># wait for a minute to allow IAM role policy attachment to propagate</span></span>
<span class="line"></span>
<span class="line"><span style="color: #839496">role_arn </span><span style="color: #859900">=</span><span style="color: #839496"> create_role_response[</span><span style="color: #2AA198">"Role"</span><span style="color: #839496">][</span><span style="color: #2AA198">"Arn"</span><span style="color: #839496">]</span></span>

Now that we have the Role, the Dataset and the S3 bucket where the interaction data is kept, we can create the dataset import job.

Python
<span class="line"><span style="color: #839496">create_interactions_dataset_import_job_response </span><span style="color: #859900">=</span><span style="color: #839496"> personalize.create_dataset_import_job(</span></span>
<span class="line"><span style="color: #839496">    jobName </span><span style="color: #859900">=</span><span style="color: #839496"> </span><span style="color: #2AA198">"personalize_ecommerce_demo_interactions_import"</span><span style="color: #839496">,</span></span>
<span class="line"><span style="color: #839496">    datasetArn </span><span style="color: #859900">=</span><span style="color: #839496"> interactions_dataset_arn,</span></span>
<span class="line"><span style="color: #839496">    dataSource </span><span style="color: #859900">=</span><span style="color: #839496"> {</span></span>
<span class="line"><span style="color: #839496">        </span><span style="color: #2AA198">"dataLocation"</span><span style="color: #839496">: interactions_s3DataPath  </span><span style="color: #586E75; font-style: italic"># S3 path built when the cleaned file was uploaded</span></span>
<span class="line"><span style="color: #839496">    },</span></span>
<span class="line"><span style="color: #839496">    roleArn </span><span style="color: #859900">=</span><span style="color: #839496"> role_arn</span></span>
<span class="line"><span style="color: #839496">)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #839496">dataset_interactions_import_job_arn </span><span style="color: #859900">=</span><span style="color: #839496"> create_interactions_dataset_import_job_response[</span><span style="color: #2AA198">'datasetImportJobArn'</span><span style="color: #839496">]</span></span>

Create the recommender

Python
<span class="line"><span style="color: #839496">create_recommender_response </span><span style="color: #859900">=</span><span style="color: #839496"> personalize.create_recommender(</span></span>
<span class="line"><span style="color: #839496">  name </span><span style="color: #859900">=</span><span style="color: #839496"> </span><span style="color: #2AA198">'viewed_x_also_viewed_demo'</span><span style="color: #839496">,</span></span>
<span class="line"><span style="color: #839496">  recipeArn </span><span style="color: #859900">=</span><span style="color: #839496"> </span><span style="color: #2AA198">'arn:aws:personalize:::recipe/aws-ecomm-customers-who-viewed-x-also-viewed'</span><span style="color: #839496">,</span></span>
<span class="line"><span style="color: #839496">  datasetGroupArn </span><span style="color: #859900">=</span><span style="color: #839496"> dataset_group_arn</span></span>
<span class="line"><span style="color: #839496">)</span></span>
<span class="line"><span style="color: #839496">viewed_x_also_viewed_arn </span><span style="color: #859900">=</span><span style="color: #839496"> create_recommender_response[</span><span style="color: #2AA198">"recommenderArn"</span><span style="color: #839496">]</span></span>

Query the recommender: once the recommender has been created, this means that we’re finally able to use it.
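As a rough sketch of what a query can look like, the helper below calls `get_recommendations` on a `personalize-runtime` client and returns just the item IDs. `recommended_item_ids` is a hypothetical name, the default of five results is an arbitrary choice, and `user_id` is left optional here as an assumption; check which identifiers your chosen use case requires.

```python
def recommended_item_ids(runtime_client, recommender_arn, item_id, user_id=None, num_results=5):
    """Return the item IDs recommended for item_id, in the order received."""
    params = {
        "recommenderArn": recommender_arn,
        "itemId": str(item_id),
        "numResults": num_results,
    }
    # Only send a user ID when the caller supplies one.
    if user_id is not None:
        params["userId"] = str(user_id)
    response = runtime_client.get_recommendations(**params)
    return [item["itemId"] for item in response.get("itemList", [])]


# Usage (assumes the recommender has finished creating; some_item_id is a
# placeholder for an ITEM_ID taken from items.csv):
# personalize_runtime = boto3.client("personalize-runtime")
# ids = recommended_item_ids(personalize_runtime, viewed_x_also_viewed_arn, some_item_id)
```

Taking the client as an argument keeps the function easy to exercise with a stub before pointing it at a live recommender.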

Let's create a helper function that returns a product's description given its ID.

Python
<span class="line"><span style="color: #586E75; font-style: italic"># reading the original data in order to have a dataframe that has both item_ids </span></span>
<span class="line"><span style="color: #586E75; font-style: italic"># and the corresponding titles to make our recommendations easier to read.</span></span>
<span class="line"><span style="color: #839496">items_df </span><span style="color: #859900">=</span><span style="color: #839496"> pd.read_csv(</span><span style="color: #2AA198">'./items.csv'</span><span style="color: #839496">)</span></span>
<span class="line"><span style="color: #839496">items_df.sample(</span><span style="color: #D33682">10</span><span style="color: #839496">)</span></span>
<span class="line"><span style="color: #93A1A1; font-weight: bold">def</span><span style="color: #839496"> </span><span style="color: #268BD2">get_item_by_id</span><span style="color: #839496">(item_id, items_df):</span></span>
<span class="line"><span style="color: #839496">    </span><span style="color: #2AA198">"""</span></span>
<span class="line"><span style="color: #2AA198">    This takes in an item_id from a recommendation, converts it to a</span></span>
<span class="line"><span style="color: #2AA198">    string, and then does a lookup in the given dataframe and returns</span></span>
<span class="line"><span style="color: #2AA198">    the item description.</span></span>
<span class="line"><span style="color: #2AA198">    </span></span>
<span class="line"><span style="color: #2AA198">    A really broad try/except clause was added in case anything goes wrong.</span></span>
<span class="line"><span style="color: #2AA198">    </span></span>
<span class="line"><span style="color: #2AA198">    Feel free to add more debugging or filtering here to improve results if</span></span>
<span class="line"><span style="color: #2AA198">    you hit an error.</span></span>
<span class="line"><span style="color: #2AA198">    """</span></span>
<span class="line"><span style="color: #839496">    </span><span style="color: #859900">try</span><span style="color: #839496">:</span></span>
<span class="line"><span style="color: #839496">        </span><span style="color: #859900">return</span><span style="color: #839496"> items_df.loc[items_df[</span><span style="color: #2AA198">"ITEM_ID"</span><span style="color: #839496">]</span><span style="color: #859900">==str</span><span style="color: #839496">(item_id)][</span><span style="color: #2AA198">'PRODUCT_DESCRIPTION'</span><span style="color: #839496">].values[</span><span style="color: #D33682">0</span><span style="color: #839496">]</span></span>
<span class="line"><span style="color: #839496">    </span><span style="color: #859900">except</span><span style="color: #839496"> Exception:</span></span>
<span class="line"><span style="color: #839496">        </span><span style="color: #268BD2">print</span><span style="color: #839496">(item_id)</span></span>
<span class="line"><span style="color: #839496">        </span><span style="color: #859900">return</span><span style="color: #839496"> </span><span style="color: #2AA198">"Error obtaining item description"</span></span>

Querying the “Customers who viewed X also viewed” Recommender:

Python
<span class="line"><span style="color: #586E75; font-style: italic"># First pick a user</span></span>
<span class="line"><span style="color: #839496">test_user_id </span><span style="color: #859900">=</span><span style="color: #839496"> </span><span style="color: #2AA198">"777"</span></span>
<span class="line"></span>
<span class="line"><span style="color: #586E75; font-style: italic"># Select a random item</span></span>
<span class="line"><span style="color: #839496">test_item_id </span><span style="color: #859900">=</span><span style="color: #839496"> </span><span style="color: #2AA198">"8fbe091c-f73c-4727-8fe7-d27eabd17bea"</span></span>
<span class="line"></span>
<span class="line"><span style="color: #586E75; font-style: italic"># Get recommendations for the user for this item</span></span>
<span class="line"><span style="color: #839496">get_recommendations_response </span><span style="color: #859900">=</span><span style="color: #839496"> personalize_runtime.get_recommendations(</span></span>
<span class="line"><span style="color: #839496">    recommenderArn </span><span style="color: #859900">=</span><span style="color: #839496"> viewed_x_also_viewed_arn,</span></span>
<span class="line"><span style="color: #839496">    itemId </span><span style="color: #859900">=</span><span style="color: #839496"> test_item_id,</span></span>
<span class="line"><span style="color: #839496">    userId </span><span style="color: #859900">=</span><span style="color: #839496"> test_user_id,</span></span>
<span class="line"><span style="color: #839496">    numResults </span><span style="color: #859900">=</span><span style="color: #839496"> </span><span style="color: #D33682">10</span></span>
<span class="line"><span style="color: #839496">)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #586E75; font-style: italic"># Build a new dataframe for the recommendations</span></span>
<span class="line"><span style="color: #839496">item_list </span><span style="color: #859900">=</span><span style="color: #839496"> get_recommendations_response[</span><span style="color: #2AA198">'itemList'</span><span style="color: #839496">]</span></span>
<span class="line"><span style="color: #839496">recommendation_list </span><span style="color: #859900">=</span><span style="color: #839496"> []</span></span>
<span class="line"></span>
<span class="line"><span style="color: #859900">for</span><span style="color: #839496"> item </span><span style="color: #859900">in</span><span style="color: #839496"> item_list:</span></span>
<span class="line"><span style="color: #839496">    item </span><span style="color: #859900">=</span><span style="color: #839496"> get_item_by_id(item[</span><span style="color: #2AA198">'itemId'</span><span style="color: #839496">], items_df)</span></span>
<span class="line"><span style="color: #839496">    recommendation_list.append(item)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #839496">user_recommendations_df </span><span style="color: #859900">=</span><span style="color: #839496"> pd.DataFrame(recommendation_list, columns </span><span style="color: #859900">=</span><span style="color: #839496"> [get_item_by_id(test_item_id, items_df)])</span></span>
<span class="line"></span>
<span class="line"><span style="color: #839496">pd.options.display.max_rows </span><span style="color: #859900">=</span><span style="color: #D33682">10</span></span>
<span class="line"><span style="color: #839496">display(user_recommendations_df)</span></span>

And there you have it: we can now query our engine to get recommendations for a specific user.

Clean up your resources on AWS.

Make sure you delete the resources you created so that you do not incur unnecessary costs.
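
As a rough sketch of what that cleanup could look like, the snippet below deletes the recommender first, then the datasets, and finally the dataset group, waiting for each deletion to finish before moving on. The function names, the `dataset_arns` list and the polling parameters are assumptions made for this example; the `delete_*` and `describe_*` calls are the Personalize client methods. Schemas, the S3 bucket and the IAM role are not covered here and would need to be removed separately.

```python
import time


def _wait_until_gone(describe, not_found_error, poll_seconds, sleep):
    """Call describe() repeatedly until it raises not_found_error."""
    while True:
        try:
            describe()
        except not_found_error:
            return
        sleep(poll_seconds)


def cleanup_personalize(personalize_client, recommender_arn, dataset_arns,
                        dataset_group_arn, poll_seconds=30, sleep=time.sleep):
    """Delete the recommender, then the datasets, then the dataset group.

    Hypothetical helper: dataset_arns is assumed to hold the ARNs of the
    datasets created earlier in the walkthrough.
    """
    not_found = personalize_client.exceptions.ResourceNotFoundException

    # 1. The recommender depends on the dataset group, so it goes first.
    personalize_client.delete_recommender(recommenderArn=recommender_arn)
    _wait_until_gone(
        lambda: personalize_client.describe_recommender(
            recommenderArn=recommender_arn),
        not_found, poll_seconds, sleep)

    # 2. Datasets must be gone before the dataset group can be deleted.
    for dataset_arn in dataset_arns:
        personalize_client.delete_dataset(datasetArn=dataset_arn)
        _wait_until_gone(
            lambda arn=dataset_arn: personalize_client.describe_dataset(
                datasetArn=arn),
            not_found, poll_seconds, sleep)

    # 3. Finally remove the now-empty dataset group.
    personalize_client.delete_dataset_group(datasetGroupArn=dataset_group_arn)


# Usage, with the names from the snippets above (dataset_arns is assumed):
# cleanup_personalize(personalize, viewed_x_also_viewed_arn,
#                     dataset_arns, dataset_group_arn)
```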